MEASUREMENT IN EDUCATION 




Measurement in Education 


An Introduction 


A. M. JORDAN 


Professor of Educational Psychology 
University of North Carolina 


New York Toronto London 
McGRAW-HILL BOOK COMPANY, INC. 
1953 






Copyright, 1953, by the McGraw-Hill Book Company, Inc. Printed in the 
United States of America. All rights reserved. This book, or parts thereof, 
may not be reproduced in any form without permission of the publishers. 


Library of Congress Catalog Card Number: 52-6540 




THE MAPLE PRESS COMPANY, YORK, PA. 


Preface 


There are two points of view extant which have influenced the con- 
struction of textbooks on measurement in education. One of these 
develops logically the history and principles of testing. Samples and 
items of tests are used mainly for illustrating the principles. There is 
no detailed study of particular tests. A second point of view describes 
the tests in detail but places little emphasis on test construction or on 
the more fundamental principles involved in measurement. 

The present text may be thought of as resulting from a combination 
of these two points of view. The thought here is that a great many 
details are necessary to develop the principles which are present in the 
test items. In case after case the principles involved in test construction 
are pointed out to the reader. One fundamental concept, frequently 
illustrated, is that a test score is merely a sample of an individual's 
performance. Because students need to discuss some tests in great 
detail, the critical approach used in this text may make them more 
sensitive to the principles involved. 

Considerable emphasis is given to the testing of reasoning and under- 
standing. Samples of attempts to measure these characteristics are 
introduced even though the tests are tentative and unavailable. They 
furnish an earnest of the direction of future testing development. 

The influence of my teachers, Edward L. Thorndike, Robert S. 
Woodworth, and F. N. Freeman can easily be detected in my treatment 
of measurement. More recently, the publications of the former Pro- 
gressive Education Association have influenced me greatly. It seems to 
me that their methods of test construction as exemplified in Appraising 
and Recording Student Progress are sound. 

My obligations are many. Publishers of tests have been very kind in 
permitting the use of items, charts, and graphs which frequently have 
been taken out of their context. At appropriate places in the text recog- 
nition is given. Some of my colleagues have also helped by critically 
reading parts of the manuscript and furnishing helpful suggestions. 
William H. Peacock has read the chapter on measurement in physical 
education; Charles M. Clark, the chapter on the measurement of the 
social sciences; Mary Bynum Pierson, the chapter on statistics; and 
Carl F. Brown, the section on reading. My wife Carrie Nicholson 
Jordan has read the entire manuscript and contributed much to its 
clarity of expression and its meaning. My thanks go out to them all. 


A. M. JORDAN 
Chapel Hill, N.C. 
August, 1952 


Contents 


PREFACE 


PART ONE. PROBLEMS OF MEASUREMENT 

1 INTRODUCTION 
Difficulties in Measuring Mental Traits. Results of Developing Units of 
Measurement. Measurement in Guidance. Measurement in Education. 
Summary, Questions and Exercises, Bibliography 

2 CHARACTERISTICS OF MEASURING INSTRUMENTS 
Internal Validity. External Validity. Recent Trends in Test Validation. 
Vitiating Factors in Validity. Reliability. Administrability. Interpreta- 
tion and Comparability. Economy. Summary, Questions and Exercises, 
Bibliography 

3 CONSTRUCTING ACHIEVEMENT TESTS 
Constructing Classroom Tests. Essay-type Questions. Short-answer 
Questions. Organization and Arrangement of Tests. Improving the 
Essay Type of Examination. Summary, Questions and Exercises, 
Bibliography 

4 THE TESTING PROGRAM—ACHIEVEMENT-TEST BATTERIES 
Planning for the Testing Program. Development of Achievement-test 
Batteries. Summary, Questions and Exercises, Bibliography 

5 MEASUREMENT OF READING, SPELLING, AND HANDWRITING 
Reading. Spelling. Handwriting. Summary, Questions and Exercises, 
Bibliography 

6 MEASUREMENT OF LANGUAGE AND LITERATURE 
Aims and Objectives of Teaching Language. Summary, Questions and 
Exercises, Bibliography 

7 MEASUREMENT OF THE SOCIAL SCIENCES 
Objectives in the Teaching of the Social Sciences. Measurement of 
Objectives. Measurement of Achievement in the Social Studies. Sum- 
mary, Questions and Exercises, Bibliography 

8 MEASUREMENT OF FOREIGN LANGUAGES 
Objectives in Teaching. The More Measurable Objectives. Tests of 
French. Spanish Tests. German Tests. Italian Tests. Latin Tests. 
Evaluation of Tests of Foreign Languages. Summary, Questions and 
Exercises, Bibliography 

9 MEASUREMENT OF MATHEMATICS 
Importance of Mathematics in Our Modern World. Tests of Mathe- 
matics in the Elementary School. Tests of Mathematics in High School. 
Summary, List of Tests in Mathematics, Questions and Exercises, 
Bibliography 

10 MEASUREMENT OF SCIENCE 
Aims and Objectives of Science Teaching. Tests of Science in the Ele- 
mentary School. Tests of Sciences in High School. Scientific Thinking. 
Attitudes and Interests in Science. Summary. List of Science Tests. 
Summary, Questions and Exercises, Bibliography 

11 MEASUREMENT OF BUSINESS EDUCATION 
Objectives in Business Education. Problems of Testing. Clerical Tests. 
Tests of Clerical Aptitudes. Clerical Achievement Tests. Bookkeeping 
Tests. Content Tests. Summary, Questions and Exercises, Bibliography 

12 MEASUREMENT OF FINE ARTS AND MANUAL ARTS 
Music. Art. Manual Arts. Mechanical Aptitude and Ability. Summary, 
Questions and Exercises, Bibliography 

13 MEASUREMENT OF PHYSICAL EDUCATION AND HEALTH 
Objectives in Physical Education. Tests of Physical Capacities. Cardio- 
vascular Tests. Tests of Strength. Tests of Posture. Tests of Motor 
Coordination. Achievement Tests. Measurement and Health Informa- 
tion. List of Tests of Health Education. Tests of Information in Physical 
Education. Summary, Questions and Exercises, and Bibliography 


PART TWO. MEASUREMENT OF INTELLIGENCE 

14 INTELLIGENCE AND ITS MEASUREMENT 
Development of Intelligence Tests. Individual Tests of Intelligence. 
The Meaning of Intelligence. Summary, Questions and Exercises, 
Bibliography 

15 GROUP TESTS OF INTELLIGENCE 
Development of Group Tests. Primary Mental Abilities. Intelligence 
Tests for Various Levels. Uses of Intelligence Tests. Results of Educa- 
tional Guidance. Uses of Intelligence Tests in Homogeneous Grouping. 
Aids in Making Decisions about Going to College. Uses of Intelligence 
Tests for Vocational Guidance. Summary, Questions and Exercises, 
Bibliography 


PART THREE. PERSONALITY INVENTORIES 

16 MEASUREMENT OF INTEREST 
Characteristics of Interests. Methods of Discovering Interests. Uses of 
Interest Inventories. Summary, Questions and Exercises, Bibliography 

17 MEASUREMENT OF ATTITUDES 
Measurement of Attitudes. Summary, Questions and Exercises, Bibli- 
ography 

18 MEASUREMENT OF PERSONALITY TRAITS 
Self-inventories or Questionnaires. Validity of Personality Inventories. 
Rating Scales. Summary, Questions and Exercises, Bibliography 


PART FOUR. STATISTICAL METHODS 

19 STATISTICAL METHODS 
Assembling the Data. Summary, Questions and Exercises, Bibliography 


INDEX 

PART ONE 


Problems of Measurement 




CHAPTER 1 


Introduction 


The process of education includes three major divisions: (1) the 
determination of goals or objectives, (2) the manipulation of materials 
and methods so that these objectives are achieved, and (3) the evalua- 
tion or appraisal of results obtained. In general, it is the function of 
philosophy to decide upon and define in terms of pupil or student 
behavior the outcomes or objectives of education. It is the function of 
psychology to discover the principles of learning and of the nature of 
childhood so that the most efficient methods and the most suitable 
material may be chosen and also so that the objectives may be achieved 
in the most efficient manner. It is the function of measurement to furnish 
such exact information about the outcomes of education that their 
evaluation and appraisal can be made with more certainty and with a 
greater degree of truth. 

In the past, it has been assumed that experts were needed for the 
determination of objectives and the selection and adaptation of methods 
and materials to the level of achievement reached by the child. There 
has been much less concern about the examinations, ratings, and other 
methods of measuring the outcomes of instruction. These latter have 
all too frequently been evaluated by means of hastily constructed 
examinations and quizzes or by ratings which not seldom have been 
influenced by that mixture of many ingredients called the school mark. 
It is also well known that a judgment of value or appraisal is accurate in 
proportion as it is based on carefully collected information. From the 
days of Starch and Elliott,¹ who sent around a photostatic copy of a 
geometry paper to be graded by teachers of mathematics, to Hartog’s 
Examination of Examinations,² in which such divergent marks were 
given to the same examination paper by professional readers of examina- 
tions, there have accumulated masses of evidence showing the inade- 
quacy and unreliability of the ordinary essay examination. Yet this 
form of testing is today perhaps more widely used than any other. 

¹ Starch, Daniel, and Edward C. Elliott, “Reliability of Grading High School 
Work in Mathematics,” School Review (1913) 21:254-259. 
² Hartog, Sir Philip, and E. C. Rhodes, An Examination of Examinations. New 
York: The Macmillan Company, 1935. 


It is thus clear that appraisals based on information gained from hastily 
constructed tests or from subjective impressions of teachers cannot 
have that element of certainty so necessary in the evaluation of objec- 
tives. It is the purpose of measurement in education to furnish instru- 
ments for measuring more precisely the outcomes of education, to 
the end that the evaluation of them may not be dependent upon 
insufficient and uncertain evidence. 

It is, of course, necessary that the objectives of education be clearly 
defined or else the measuring instruments cannot be constructed. 
The attainment of complete clarity in objectives has been complicated 
by changes and additions to them introduced from time to time. 
Today there is much greater emphasis upon the total personality than 
heretofore. This means the introduction of many new objectives. At 
the present time we hear much about the well-adjusted emotional 
life, the formation of wholesome attitudes, appreciations of the beauti- 
ful, the development of interests, and the over-all picture of moral 
character. As soon as the objectives are clearly defined in terms of 
children's habits, ideals, and other behavior manifestations, measure- 
ment becomes possible. At the present time, for example, there are 
well-constructed inventories of emotional balance, attitude scales, 
tests of art and music, interest blanks, and procedures for measuring 
cheating, lying, and stealing. 


DIFFICULTIES IN MEASURING MENTAL TRAITS 


At first the difficulties of measuring the mental traits of human 
beings seemed insurmountable. There was such a sharp contrast 
between the complexity, let us say, of silent reading and the simplicity 
of linear distance. Even general merit in handwriting, with its elements 
of slant, letter formation, quality of line, spacing, and alignment, 
seemed complex indeed. And yet after much experiment with questions 
and answers in silent reading, for example, there have been secured 
tests which bring out the delicate shades of meaning inherent in the 
paragraph. If a child, then, can answer these questions (which are 
based on the selections read) he has achieved the objective sought in 
reading instruction. Handwriting, too, has yielded somewhat to a 
measurement of its general merit by means of a scale made up of 
samples of handwriting whose quality increases by steps declared 
equal by expert judges. 

A second difficulty in measuring human traits was that of variability 
of the individual measured. Measures even in the physical sciences 
had shown slight variations. Small differences, for example, in the 
length of an iron bar were caused by changes in temperature, and 
variations in the speed of sound were caused by changes in atmospheric 
conditions, but these seemed trivial compared with the variations 
between “usual” and “best” in a child's handwriting or in the speed of 
reading a paragraph from one time to the next. It was discovered, 
for example, that far less variation in performance took place if the 
subjects could be induced to put forth their best efforts. Small dis- 
tractions, too, were eliminated, and great care exercised in giving 
the same setting to a problem on subsequent occasions so that the 
variations from one test to another have been reduced to a known 
minimum. 

The third problem of determining the zero of measurement, which 
Thorndike raised in his treatment of the fundamentals of measurement, 
has not been solved but has been by-passed. Mental age uses birth as 
the point of reference, so that a mental age of 2 years would indicate 
the average intellectual performance of children 2 years from their 
natal day. Other points of reference have been the mean of a standard 
group such as of all 12-year-olds. If the point of reference is clearly 
defined and well understood by all, the zero, or “just not any,” of a 
trait is not of such great importance. We must remember that ther- 
mometers use both 32 degrees below freezing (Fahrenheit) and freezing 
(centigrade) as reference points, each of which is called zero and both 
of which are arbitrarily taken. 

Not all difficulties of measuring human responses have been as well 
resolved as have the three just mentioned. The problem of securing 
validity stands out at present above all others. Validity refers to the degree 
of effectiveness a measuring instrument achieves in doing that which 
it claims or purports to do. These difficulties in securing integrity in 
the instrument concerned appear in achievement tests, intelligence 
tests, and personality inventories. 

In the area of achievement tests the question is pretty largely one of 
sampling. If the habits desired, let us say, in reading are clearly defined, 
then a test samples judiciously the entire area. But it is easily per- 
ceivable that this procedure might omit several areas whose under- 
standing would be highly desirable. In intelligence testing there is no 
agreed-upon criterion against which the test may be projected. If we 
use teachers’ estimates, then the test is better than the criterion. If 
we use teachers’ marks, we are using a criterion greatly influenced by 
daily attendance and personality traits. In spite of the expenditure of 
much energy and effort, this problem of the validity of intelligence 
tests remains partially unsolved. In much worse plight in regard to 
validity are the personality inventories. Let us take that of the neurotic 
inventory. In such an instrument are usually gathered a hundred or so 
items which are generally regarded as indicative of emotional malad- 
justment. “Do you daydream frequently? Do you feel miserable most 
of the time? Do you have spells of dizziness?” are samples. If the 
emotionally maladjusted always daydreamed frequently and the well- 
adjusted never; if the neurotic always feel miserable most of the time 
and the normal never; or if only the emotionally upset always had 
spells of dizziness and the normal never—the validation process would 
be a comparatively simple one. But such is not the case. Perfectly 
normal subjects may have now and then any of the symptoms men- 
tioned above. The validity of neurotic inventories remains an unsolved 
problem in the area of measurement. 

Another fundamental difficulty in the area of mental measurement 
is that of developing a unit of measurement which does not vary from 
one situation to another. If such constant units were developed they 
could be added, subtracted, multiplied, and divided with no substantial 
errors. Three of the many attempts to secure constant units will be 
discussed. 

In the first place, Thorndike’s handwriting scale, first published in 
1909, was called scientific because he apparently had discovered a unit 
which was the same on all occasions. To Thorndike a unit was a differ- 
ence between two samples of handwriting which 75 per cent of hand- 
writing experts had perceived. Thorndike adopted the Cattell-Fullerton 
theorem that differences equally often noticed are equal except when 
they are always noticed or never noticed. By applying this theorem 
to samples where the judgment of difference was never unanimous 
he was able to get around the last part of the theorem. Let us take as an 
illustration five samples of handwriting—A, B, C, D, and E. Suppose 
now that these samples were selected from many others because 
75 per cent of the judges said that B has a higher general merit than A; 
75 per cent said that C has a higher general merit than B; 75 per cent 
said that D has a higher general merit than C; etc. Then the differences 
between the samples are equal. They are equal because they are equally 
often noticed. In short, B-A = D-C or C-B = E-D. But 75 per cent 
is 25 per cent above the mean, and the statistical term which includes 
25 per cent of the judgments above the mean is the probable error. 
The probable error was thus used as a unit of measure. The principal 
difficulty with this whole procedure is that the truth of the theorem on 
which the method is based has never been firmly established. 
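
The correspondence between 75 per cent agreement and one probable error can be checked numerically. The following is only a sketch, assuming that the judgments follow the normal curve; the function name is hypothetical and the computation is not taken from Thorndike's own procedure.

```python
from statistics import NormalDist

# One probable error is the distance from the mean that marks off 25 per cent
# of a normal distribution above the mean, i.e. the 75th percentile.
ONE_PROBABLE_ERROR = NormalDist().inv_cdf(0.75)  # about 0.6745 standard deviations


def agreement_to_pe_units(proportion: float) -> float:
    """Express a proportion of judges noticing a difference in probable-error units.

    Assumption (not from the text): judgments are normally distributed.
    The theorem excludes differences always or never noticed, so the
    proportion must lie strictly between 0 and 1.
    """
    if not 0.0 < proportion < 1.0:
        raise ValueError("proportion must lie strictly between 0 and 1")
    distance_in_sd = NormalDist().inv_cdf(proportion)
    return distance_in_sd / ONE_PROBABLE_ERROR


print(round(ONE_PROBABLE_ERROR, 4))   # 0.6745
print(agreement_to_pe_units(0.75))    # 1.0
```

Under this assumption a difference noticed by 75 per cent of judges is one unit, and a difference noticed by only half of them is no difference at all.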

A second unit of measure very frequently used is the mental year, 
which is simply the difference between two consecutive mental ages. 
Mental age, first given a scientific connotation by Alfred Binet in 1908 
in connection with the measurement of intelligence, has come into wide 
use because its meaning is so clear. But the unit “mental year" is less 
constant than the above-mentioned probable error. It has been demon- 
strated that the amount of mental growth varies from one year to the 
next. In general, the unit is large during the earlier years and becomes 
progressively smaller from the years 12 to 20. For example, any good 
intelligence test will distinguish easily between the average 4-year-old 
and the average 5-year-old but only our most refined tests indicate a 
clear difference between the average 12-year-old and the average 
13-year-old. It would seem therefore that the unit “mental year" 
varies in length from one year to the next. 

A third unit of measurement which is probably more constant than 
the two just described is the standard score. McCall, who used this 
unit on the Thorndike-McCall reading test, called it the T-score. In 
constructing this reading test McCall struck upon the idea of using the 
mean of 12-year-olds as a point of reference. It is a well-known fact 
that measures of any unselected group have a tendency to pile up in the 
proximity of the mean and to appear less and less frequently as the 
distance from the mean increases. This arrangement of scores is called 
the normal curve. To obtain a standard score McCall subtracted the mean 
from a score and divided the difference by the standard deviation of the 
12-year-olds. This gave a standard-deviation score. Negative scores were 
avoided by assuming a mean of 50. He then measured five standard- 
deviation units along the base line and in both directions from the 
mean. In this manner he had available 10 units along the base line. 
McCall then divided each of the 10 units into 10 smaller units. There 
were thus 100 units, each unit as nearly as possible equal to each other 
unit. 

The use of these equal units can be realized when we understand 
that a child who increases his score from 40 to 50 T-score units has 
made the same gain as has another child whose score increases from 
80 to 90. These standard scores have been widely used and will be dis- 
cussed further on a later page. 
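
The arithmetic of the T-score described above can be sketched as follows. This is a minimal illustration of the linear form given in the text (mean set at 50, each unit one-tenth of a standard deviation); the function name and the reference-group values are hypothetical and are not McCall's data.

```python
def t_score(raw_score: float, reference_mean: float, reference_sd: float) -> float:
    """Convert a raw score to a T-score relative to a reference group.

    The reference group in the text is unselected 12-year-olds. Each T unit
    is one-tenth of the reference group's standard deviation, and the mean
    is placed at 50 so that negative scores are avoided.
    """
    if reference_sd <= 0:
        raise ValueError("standard deviation must be positive")
    standard_deviation_score = (raw_score - reference_mean) / reference_sd
    return 50 + 10 * standard_deviation_score


# Hypothetical reference group: mean raw score 30, standard deviation 10.
print(t_score(30, 30, 10))  # 50.0, a child exactly at the mean
print(t_score(40, 30, 10))  # 60.0, one standard deviation above the mean
print(t_score(20, 30, 10))  # 40.0, one standard deviation below the mean
```

Five standard deviations on either side of the mean then run from 0 to 100, which gives the 100 units mentioned in the text.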

RESULTS OF DEVELOPING UNITS OF MEASUREMENT 


Granted that objectives of education have been clearly defined in 
terms of student reaction, and instruments which employ adequate 
units of measurement constructed, then there are a large number of 
problems which may be attacked. Among these, method stands out 
prominently. For example, does the reading of a wide range of 
interesting material develop a greater capacity for reading for under- 
standing than would the more intense studying of a narrower field? 
Two groups, equivalent to begin with in reading capacity, as based on 
our well-established measuring instrument, are subjected to radically 
different methods under the same teacher. What is the differential 
effect upon the two groups of these two methods? The answer is 
straightforward and understandable. That method is better which 
has brought about the greater change on our measuring instrument. 
If a large enough sample were secured to make the findings statistically 
reliable, the judgment could then be made that one or the other method 
was definitely superior for improving the understanding of reading by 
children at the level studied. Mind you, the judgment would not have 
been a valid one had not the objective been clearly defined and the 
measuring instrument validated on the basis of agreement with the 
objective. It is not difficult to see that valid judgments could be made 
as to the efficacy of the size of class, length of the recitation, number of 
books in the library, and the preparation of teachers if the trouble were 
taken to measure each one by means of its degree of attainment of the 
described objective. 
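
The comparison of two methods described above can be sketched in a few lines. The scores below are invented purely for illustration, the function names are hypothetical, and Welch's t statistic is offered only as one common way of judging whether a difference in mean gain is large relative to sampling variation; the text itself names no particular statistic.

```python
from math import sqrt
from statistics import mean, variance


def gains(before: list[float], after: list[float]) -> list[float]:
    """Gain of each pupil on the same measuring instrument."""
    return [a - b for a, b in zip(after, before)]


def welch_t(gains_a: list[float], gains_b: list[float]) -> float:
    """Welch's t statistic for the difference between two mean gains."""
    standard_error = sqrt(
        variance(gains_a) / len(gains_a) + variance(gains_b) / len(gains_b)
    )
    return (mean(gains_a) - mean(gains_b)) / standard_error


# Hypothetical scores for two groups that began at about the same level.
wide_reading = gains([40, 42, 45, 50, 38], [48, 50, 51, 59, 45])
narrow_study = gains([41, 43, 44, 49, 39], [45, 46, 49, 52, 43])

print(mean(wide_reading))  # 7.6
print(mean(narrow_study))  # 3.8
print(welch_t(wide_reading, narrow_study) > 0)  # True: wide reading gained more here
```

Groups of five pupils are far too small for a dependable judgment; as the text notes, the sample must be large enough to make the findings statistically reliable.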

Let us now suppose that in all areas of education objectives were 
clearly defined, and adequate measuring instruments for these objec- 
tives had been constructed, so that degrees of attainment of the objec- 
tive would be immediately reflected upon the measuring instrument. 
Under these conditions guesswork would disappear from education. 
Teachers would be forced to state in terms of pupil reaction what were 
the objectives of each unit of work. These objectives might then be 
referred to a competent committee who could modify them until they 
were satisfactory. A committee now goes to work to construct an 
instrument which would faithfully reflect these objectives. The teacher 
and pupil would find in this instrument great benefits. The teacher 
could see immediately the results of her instruction. The pupil would 
have an incentive unsurpassed. His mark now instead of reflecting his 
activities in a half dozen different areas would indicate simply the 
degree of success attained in a single area. And while he might not be 
permitted to continue until he had reached an adequate score on the 
test, he would at least know where he stood. 

An ideal situation has been pictured here which exists in 
only a few areas of human learning and human development. It is the 
purpose of this book to describe objectives and instruments for measur- 
ing them. In some cases tests have been constructed with too little 
regard to objectives. Sometimes the objectives have been warped 
to fit the instrument. In many cases the objectives and instruments 
have not aimed at the same thing. The idea, however, cannot be 
condemned because of the imperfections discovered in the details of its 
execution. 

Fairly considered and applied, this procedure will help lead us out 
of the area of guesswork in education. Progress comes in every area 
where units of work are clearly defined. In the past, in the present, and 
in the future, improvement in the educative process takes place most 
effectively in areas where objectives have been most clearly defined and 
measuring instruments most carefully constructed. 




MEASUREMENT IN GUIDANCE 

The area which illustrates the uses of valid measures in some areas 
and their lack in others is that of educational guidance. 

The attainment of an individual on a test or examination indicates 
both what he has done and what he will do. If he has succeeded in a 
given time in learning the fundamentals of arithmetic the chances are 
that he will continue to learn that subject at about the same rate. 
Evidently, the score on a good test is indicative of present achievement 
and of future possibilities. For this reason test scores are very useful 
in guidance. Of course, the judgment made about the future progress 
of an individual from the available evidence cannot be as accurate 
as that one made about the past. And yet, all guidance depends upon 
the accuracy of prediction of human behavior. The more complete 
the record has been up to the present, the better the prediction and 
the better the guidance. For best guidance the total individual must be 
represented. In the past, accumulated records have contained school 
marks in various subjects, scores on reading, intelligence tests, and a 
few other things. They, for the most part, have omitted records of 
interests, attitudes, habits of work, emotional level, adjustment to 
peers and teachers, etc. It can be clearly seen that many desirable 
objectives are not too clearly defined in the minds of the teachers, nor 
are there tests or measures on which they can be accurately recorded. 
Motives, drives, attitudes greatly influence the success or failure of 
individuals. No real guidance can be administered without attention 
to these more intangible traits. Nor can we be satisfied until both 
objectives and measures are well developed in these areas. 

Guidance then is dependent upon the records of significant events 
in an individual's life up to the present time. Anecdotal records are 
sometimes useful because they show the whole individual in action. 
But the more precise measures can be made and kept, and the more 
all-inclusive individual records are, the better can the guidance be. 


MEASUREMENT IN EDUCATION 

Well-constructed standardized measurements exist today in three 
large areas: (1) achievement tests, (2) intelligence tests, and (3) per- 
sonality inventories and rating scales. 


ACHIEVEMENT TESTS 
Achievement tests are essentially improved types of examinations or 
tests which cover an area of learning. Improvement over usual examina- 
tions and tests consists of (1) more careful selection of representative 
items, (2) greater care in item construction, (3) a preliminary tryout 
of the items selected, (4) the establishment of norms, and (5) greater 
accuracy in grading or scoring. Greatest success in constructing achieve- 
ment tests has come about when (1) the objectives have been clearly 
defined, (2) situations have been arranged so that the objectives are 
clearly reflected, and (3) the amounts or degrees of the objectives have 
been indicated in the score obtained. 

Achievement tests may be divided into informal and formal. The 
informal tests, which are far more frequently used than the formal ones, 
are constructed by the teacher. Two types of them have been most 
common: (1) the essay test, and (2) the short-answer test. Competent 
teachers have been able to improve greatly both these types. 

The formal or standardized tests are more carefully constructed than 
the informal. Their items are subject to a number of revisions and are 
submitted to several persons who judge their value. The selection of 
items which are common to textbooks or courses of study implies 
that a thoroughgoing canvass of materials and objectives has already 
been made. After all this preliminary work has been done the test in 
its final form is given to a large number of unselected subjects whose 
scores are used to establish the norms and to compute the reliability. 
Good constructors of achievement tests publish enough of the con- 
struction procedures so that competent judges can be certain about the 
test’s adequacy. 


INTELLIGENCE TESTS 


Intelligence tests attempt to measure capacities for learning, think- 
ing, reasoning, and so on, without regard to the materials involved. 
They would measure general intelligence. Intelligence tests may be 
divided, on the basis of their use, into (1) individual tests, which 
examine one subject at each sitting, and (2) group tests, which can be 
applied to many subjects at one sitting. 

There are many types of individual tests, though the Binet revisions 
are most frequently used at present. Binet’s tests, introduced into the 
United States in 1911 by Dr. Henry Goddard, have had many revisions 
and adaptations to American conditions. All these revisions use the 
mental age as the unit of attainment and divide it by the chronological 
age to compute the I.Q. Another type of intelligence test has made 
its appearance in recent years: the Wechsler-Bellevue. This test, in- 
tended for adult subjects and those above the age of ten, does not use 
the mental age but keeps the I.Q. though slightly altered in meaning. 
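
The ratio just described can be written out as a short sketch. The multiplication by 100 is the customary convention for the ratio I.Q. and is an addition here, since the text mentions only the division; the function name and the sample ages are hypothetical.

```python
def ratio_iq(mental_age_months: float, chronological_age_months: float) -> float:
    """Ratio I.Q.: mental age divided by chronological age, times 100.

    Both ages are taken in months so that fractions of a year are handled
    without rounding. The factor of 100 is the usual convention, assumed
    here rather than stated in the text.
    """
    if chronological_age_months <= 0:
        raise ValueError("chronological age must be positive")
    return 100 * mental_age_months / chronological_age_months


# A hypothetical child of 8 years (96 months) with a mental age of 10 years.
print(ratio_iq(120, 96))  # 125.0
# A child whose mental age equals his chronological age.
print(ratio_iq(96, 96))   # 100.0
```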

Group intelligence tests originated from the dire need to test large 
numbers of army conscripts in 1917. These original tests, objectively 
scored, sampled much of the same behavior tested by the individual 
test. So many group tests of intelligence have been constructed that 
today satisfactory ones are available from 5 years of age to adulthood. 




PERSONALITY INVENTORIES AND RATING SCALES 


In this category are included attempts to measure many dimensions 
of personality. Self-confidence, dominance, introversion, self-suffi- 
ciency, neuroticism are samples. Most of these attempts are based on 
inventories in the form of questionnaires whose questions are usually 
answered with “Yes,” “No,” and sometimes with a “?.” The first of 
these inventories was developed by Woodworth during the First World 
War. It consisted of 116 descriptions of mental symptoms which were 
to be answered “Yes” or “No.” “Have you ever had fits of dizziness?” 
“Do you have a great fear of fire?" “Can you stand the sight of blood?” 
are samples of the questions used. Many other inventories with some 
modifications have developed from this pioneer attempt. The Cali- 
fornia Test of Personality, the Bernreuter Personality Inventory, the 
Bell Adjustment Inventory, and many others have been standardized. 

Many behavior traits are as yet not included in standardized inven- 
tories. To get some indication of the presence of these traits in children, 
ratings are necessary. In such a set of rating scales as is contained in 
Behavior Rating Schedules¹ the scales are usually constructed of five 
divisions, each of which is described verbally. For example, the twenty- 
eighth item asks, “Is he sympathetic?” which is to be rated on the 
following scale: 


Inimical       Unsympathetic   Ordinarily      Sympathetic     Very 
Aggravating    [illegible]     friendly and    Warm hearted    affectionate 
Cruel          Cold            cordial 


The most recent attempts to get at the inner life of subjects in a 
qualitative way are the projective techniques. By presenting materials 
whose meaning is not too clear (unstructured), it is hoped that somehow 
the subject will unfold his inner life and help the observer to understand 
the very nature of his being. The Rorschach inkblots and Murray’s 
Test of Thematic Apperception are good examples. 

Other personality areas are those of interest, attitude, and moral 
character. Interest blanks may be thought of as attempts to discover 
those areas of interest which are directly related to success in certain 
occupations. Attitude scales consist of a series of statements varying 
from complete belief to complete disbelief in some institution. One can 
thus check a statement that the [illegible] is the noblest of our 
institutions or the one most to be abominated. Tests of cheating, lying, 
and stealing are samples of attempts to know more precisely the outcome 
of moral instruction. 

1 Haggerty, Olson, and Wickman, Behavior Rating Schedules. Yonkers, N.Y.: 
World Book Company, 1930. Item by permission. 




SUMMARY 


In order to evaluate the outcomes of education, measurement is 
essential. It works best when objectives are clearly defined and are 
understood by both the teacher and the learner. Under these conditions 
graded situations can be arranged so that the extent of achievement of 
the objective can be registered upon them. Measurement is usually the 
introduction of a defined unit into the total. Measurements are useful 
for supplying facts on which better guidance may be based. 

Fundamental difficulties have arisen in connection with the measurement 
of mental traits. The variability of human subjects, the complexity 
of the function measured, and the establishment of an agreed-upon 
zero have all proved difficult problems to solve. Along with these 
difficulties the proof of the validity of tests, especially in the 
area of personality inventories, remains one of measurement's unsolved 
problems. 

Measurement in all areas of science has been advanced by the dis- 
covery and rigid definition of suitable units which remained the same at 
all times. Mental age, equal-appearing units, and T-scores were cited 
as samples of attempts in this direction. None of these units satisfied 
completely the strict scientific canon of constancy. Perhaps the T-score 
or standard score comes the nearest to meeting this requirement. Areas 
in which measurements have been constructed are (1) achievement 
tests, (2) intelligence tests, and (3) personality inventories, which 
include neurotic conditions, ascendance-submission, interests, attitudes 
and other dimensions of personality. 
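
As a rough illustration of the standard score mentioned above, the following sketch converts raw scores first to deviations from the group mean in standard-deviation units and then to T-scores by the linear form T = 50 + 10z. The raw scores and the function name `t_scores` are invented for the example. The text gives no formula at this point, and McCall's original T-scale was derived from percentile ranks rather than from this linear conversion, so the sketch should be read as one common modern form only.

```python
from statistics import mean, pstdev

def t_scores(raw):
    """Convert raw scores to linear T-scores (mean 50, standard deviation 10)."""
    m = mean(raw)
    s = pstdev(raw)  # standard deviation of this particular group
    if s == 0:
        raise ValueError("all scores are equal; a standard score is undefined")
    return [50 + 10 * (x - m) / s for x in raw]

# Hypothetical raw scores for five pupils (assumed data, not from the text).
raw = [12, 15, 18, 21, 24]
print([round(t, 1) for t in t_scores(raw)])
```

Because every score is expressed relative to the mean and spread of the group tested, the unit is constant only for groups that are themselves comparable.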


QUESTIONS AND EXERCISES 

1. What are the three major divisions of the process of education? 
2. Why, do you suppose, was the measurement of the outcomes of 
education neglected? 
3. Just how are objectives and measurement related? 
4. Distinguish between measurement and appraisal. 
5. Describe the fundamental difficulties of educational measurement. 
What steps have been taken to overcome these difficulties? 
6. Secure an Ayres or Thorndike handwriting scale and study critically 
the differences in samples on each scale. 
7. Why does validity receive such a prominent place in measurement? 
Why is it so difficult to achieve in intelligence and personality tests? 
8. Explain the difficulties in constructing units of measurement. What 
is the standard score? How is it derived? 
9. How can measurement be used in guidance? 
10. Describe some problems in education that might be attacked did we 
have satisfactory measuring instruments. 
11. Describe the three large areas in which measurement has been 
attempted. Name one test in each area. 
12. Why should measurement be made in education? 




BIBLIOGRAPHY 


CRONBACH, LEE J.: Essentials of Psychological Testing. New York: 
Harper & Brothers, 1949. 

GOODENOUGH, FLORENCE L.: Mental Testing. New York: Rinehart & 
Company, Inc., 1949. 

GREENE, EDWARD B.: Measurements of Human Behavior. New York: The 
Odyssey Press, Inc., 1941. 

GREENE, HARRY A., ALBERT N. JORGENSEN, and J. RAYMOND GERBERICH: 
Measurement and Evaluation in the Elementary School. New York: 
Longmans, Green & Co., Inc., 1942. 

GREENE, HARRY A., ALBERT N. JORGENSEN, and J. RAYMOND GERBERICH: 
Measurement and Evaluation in the Secondary School. New York: 
Longmans, Green & Co., Inc., 1943. 

LINDQUIST, E. F. (ed.): Educational Measurement. Washington, D.C.: 
American Council on Education, 1951. 

REMMERS, H. H., and N. L. GAGE: Educational Measurement and 
Evaluation. New York: Harper & Brothers, 1943. 

ROSS, C. C.: Measurement in Today’s Schools, 2d ed. New York: 
Prentice-Hall, Inc., 1947. 

SMITH, EUGENE R., RALPH W. TYLER, et al.: Appraising and Recording 
Student Progress. New York: Harper & Brothers, 1942. 

SUPER, DONALD E.: Appraising Vocational Fitness. New York: Harper & 
Brothers, 1949. 


CHAPTER 2 


Characteristics of Measuring Instruments 


All good measuring instruments have certain characteristics in 
common. These characteristics have been so well developed that they 
may be applied as criteria of effectiveness to any old or new measuring 
instrument. In the area of measurement of achievement the tests of 
the simpler, more observable outcomes of education were the first to 
possess these qualities which later were found to be characteristic of all 
good measuring instruments. For example, Courtis's tests in arithmetic, 
which consisted of addition, subtraction, multiplication, and division, 
were observed to give nearly the same results on successive occasions 
and to include many of the processes involved in the four fundamental 
operations in arithmetic. They had therefore both reliability and 
validity. These same characteristics of reliability and validity were 
shown to apply when the outcomes of education became more com- 
plicated. The measurements of composition, silent reading, and arith- 
metic problems were seen to be more effective when they possessed 
reliability and validity. Even in the most complicated measures of 
ability to reason, of attitudes, of interests, and of good adjustment, 
improvement came when they conformed to these principles of reliability 
and validity. 


From all these attempts at measurement certain characteristics have 
emerged which may be regarded as being of the highest importance. 
The leading characteristics of all good measuring instruments are: 

1. Validity 

2. Reliability 

3. Administrability 

4. Interpretation and comparability 

5. Economy 

Placed first in the list and in every way of first importance is validity. 

VALIDITY 

The most important question to ask about a test which is being 
considered for use is: “Is it valid?” When is a test valid? What is 
meant by validity? Probably a better question would be: “For what is 
this test valid?” If a test indicates a known amount of progress toward 
an objective it is valid for that purpose. In the Courtis Research 
Tests in Arithmetic, Addition consists of adding sets of nine three-place 
numbers. The score is in terms of speed and accuracy. This test, then, 
is valid for measuring speed and accuracy in column addition. It is 
not valid for measuring the addition of fractions or decimals or 
denominate numbers. It is valid for a particular purpose. Some have said, 
“A test is valid in proportion as it measures what it purports to 
measure.” One author emphasizes our knowledge of what a test measures 
as being an indispensable characteristic of validity: “A test is valid 
to the degree that we know what it measures or predicts.”¹ There is a 
logical fallacy here, since we might know positively that a test does not 
satisfy its claims. It might be truer to say that a test is valid in 
proportion as it measures well what is desired to be measured. The phrase 
“measures well” implies an empirical trial of the test with an adequate 
sample of subjects and computations to indicate the degree of success 
it had achieved in measuring the desired outcome. If, then, the 
instrument which is chosen reflects accurately the degree of attainment 
of a defined objective it is valid for that purpose. To ensure this 
validity careful test builders exert great care (1) in the construction 
of the test, and (2) in correlating it with some external criterion. We 
might call the first of these internal validity; the second, external 
validity. 


INTERNAL VALIDITY 
Achievement Tests 


Internal validity refers to the care with which the items of the test 
are selected and arranged. The elements which make up a test are 
constructed after a consideration of the agreed-upon objectives. The items 
are carefully written, judged by a jury of experts, and then tried out 
upon a small sample of subjects. Ambiguities and misunderstandings are 
sure to appear in connection with certain items. These items are 
modified in statement or omitted entirely. Sometimes, even at this late 
date, further revisions are made before the test assumes its final form. 

If our objective were to make the most valid test for an elementary 
algebra class, the teacher would be the best one to do it. He would 
know exactly the areas he had taught, the objectives he had in mind. 
He might analyze the areas into the processes employed and then 
construct a test which contained samples of all the algebraic processes, 
with each process being represented at three or four different levels of 
difficulty. If such a test were carefully constructed it would reflect 
accurately progress in the mastery of the algebraic processes studied 
and the defined objectives. In such a test the curricular or internal 
validity would be satisfactory. For obtaining the curricular validity for 
this particular subject, this procedure has no rival. 

1 Cronbach, Lee J., Essentials of Psychological Testing, p. 48. New York: 
Harper & Brothers, 1949. 


Frequency of Occurrence 


In contrast to the teacher's test of specific subject matter a standard 
test over the same area would base its items on subject matter common 
to courses of study and popular textbooks. The procedures used in 
constructing such a test indicate the method. Let us see how it worked 
in one case. In constructing the literature section of the Unit of 
Attainment Test, the literary samples were selected from lists recommended 
by state courses of study. A list of the better state courses of study was 
made for the author by one member of our department, M. R. Trabue,¹ 
who had at that time been investigating state courses of study. This 
list was then rated by two other competent persons. With this list of 
10 courses of study in hand, prose and poetry selections were made 
which were common to at least 9 out of the 10. Multiple-choice items 
for each selection were then constructed. 
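
The rule of keeping only those selections common to at least 9 of the 10 courses of study can be sketched as a simple tally. The course lists, the selection titles, and the function name `common_selections` below are invented for illustration; only the 9-out-of-10 threshold comes from the procedure just described.

```python
from collections import Counter

def common_selections(courses, minimum):
    """Return selections that appear in at least `minimum` courses of study."""
    counts = Counter()
    for selections in courses:
        counts.update(set(selections))  # count each course at most once per selection
    return sorted(title for title, n in counts.items() if n >= minimum)

# Hypothetical data: 10 courses of study, each a list of recommended selections.
courses = (
    [["Selection A", "Selection B", "Selection C"]] * 8
    + [["Selection A", "Selection B"], ["Selection A", "Selection D"]]
)

# Keep what is common to at least 9 of the 10 courses.
print(common_selections(courses, 9))
```

A tally of this kind makes plain why local materials drop out: a selection recommended in only one course can never reach the threshold.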

In the construction of other tests, many devices have been used to 
find the most frequently used materials. In one case, a pool of items 
was made from those common to a list of textbooks regularly used in 
that area. In another, questions from a sequence of examinations have 
been inspected. In a series of examination questions, some questions 
in slightly different form occur more than once. These have been used 
as bases for test construction. The use of frequency of occurrence in 
textbooks or courses of study as a criterion tends to neglect local 
materials introduced for interest and to perpetuate common facts in the 
test. Unless the new material introduced into one textbook were incor- 
porated generally into others, it could not appear in the test. At times, 
the undue influence of test items on teachers has tended to discourage 
not only experimentation with the curriculum but also the introduction 
of materials gathered from the locality. 

This implied tendency of a certain type of standardized test to 
"freeze" the content of the curriculum may be largely avoided by 
constructing a test of the more permanent aspects of education. Thus 
the test may not be naively concerned with the mere reproductions 
of facts but may deal with the interpretation of these facts embedded 
in a new situation. In science, for example, this would mean tests of 
understanding scientific method such as the formulation and testing of 
hypotheses or the solution of problems unlike any that had been studied. 
With this type of item the criticism that objective tests encourage 
memorizing specific facts disappears. 


1 Now of Pennsylvania State College. 


Judgment of Experienced Observers 


The use of the judgment of experienced observers as a criterion 
against which to measure a test's validity is nicely illustrated in the 
construction of the Iowa Silent Reading Tests. This was directly 
influenced by A Survey of a Course of Study in Reading.¹ This investigation 
listed and analyzed the characteristics which are ordinarily met in 
typical reading situations (Table 1). 


TABLE 1. READING ABILITIES vs. READING TEST 

Horn and McBroom's List of                  Iowa Silent Reading Test*
Reading Abilities

1. Skill in recognizing new words           Test 1. Word meaning
                                              Part A. Social science
                                              Part B. Science
                                              Part C. Mathematics
                                              Part D. English
2. Ability to locate material quickly.      Test 2. Location of information
   Involved use of index, table of            Part A. Use of the index
   contents, dictionary, card files, etc.     Part B. Selection of key words
3. Ability to comprehend quickly what       Test 3. Paragraph meaning
   is read                                    Part A. Science
                                              Part B. Poetry
                                              Part C. Political science
4. Ability to select and evaluate           Test 4. Paragraph organization
   material needed                            Part A. Selection of central idea
                                              Part B. Outlining
5. Ability to organize what is read.        Test 5. Sentence meaning. (Set of
   Involved summarizing, ordering of          sentences of [illegible] difficulty
   topics, discovery of related               to be answered by “Yes” or “No”)
   material, and [illegible]
6. [Illegible] of material read             Test 6. Rate of selected reading
                                              Part A. In reading science material
                                              Part B. In reading political
                                                      science material
7. Knowledge of sources
8. Attitude of attacking reading with
   vigor
9. Attitude of proper care of books

* Test numbers changed. 
If one compares the two columns of the table, one sees that while 
the test follows this analysis of reading pretty closely, it emphasizes 
more the facility in reading with comprehension than the many other 
uses to which reading is put. Thus the test emphasizes “the ability 
to comprehend quickly what is read” in two selections, one from 
science and one from literature, as well as in the understanding of 
sentences. Word knowledge is well represented as well as the looking 
up of items in indexes and the speed of reading. Attitudes, knowledge 
of sources, and the proper care of books are omitted. We can see that 
this excellent test of reading is not completely valid. Such a procedure 
for selecting items is less static than the preceding and includes some 
aspects of social utility. 

1 Horn, Ernest, and Maude McBroom, A Survey of a Course of Study in 
Reading, Extension Bulletin No. 93, College of Education Series No. 3, 
University of Iowa, 1924. 


Social Utility 


It is inconceivable that the criteria thus far presented for selecting 
items for a test should have been entirely devoid of social utility. 
Even if the criteria used implied the presence of social utility its 
significance must not stop at mere implication. Social utility is then used 
here as a separate criterion although it is related to all the others. 
Here is a test in spelling, for example, which selects its words because 
of their frequency in private correspondence, or another, because of the 
frequency of references in reading, or yet another, because of the 
number of times words are misspelled. At least one test in home 
mechanics has been based on a course of study which was composed of 
activities engaged in while mending the things around the home. 

But until educational objectives are dominated by the ideal of social 
utility in their formulation, the measurer is helpless. Remember that a 
good measuring instrument is valid only in so far as it indicates the 
degree to which an agreed-upon objective has been reached. 


Psychological and Logical Analysis 


One of the best illustrations of a slightly different emphasis upon 
validating criteria appears in the report of the Evaluation Staff of the 
Progressive Education Association.¹ Their procedures illustrate 
precisely what is meant by psychological analyses in test construction. 
In this investigation clear-cut objectives were first decided upon. 
The 30 participating schools entered into this project by setting forth 
the objectives of education which their respective staffs had worked 
out. This rather long list was studied by the Evaluation Staff and 
consolidated into ten objectives, as follows: 

1. Methods of thinking 

2. Useful study skills and work habits 

3. Social attitudes 

4. Wide range of significant interests 

5. Increased appreciation of music, art, and literature 

6. Social sensitivity 

7. Better personal and social adjustments 

8. Acquisition of important information 

9. Physical health 

10. Consistent philosophy of life 

1 Smith, Eugene R., Ralph W. Tyler, et al., Appraising and Recording Student 
Progress. New York: Harper & Brothers, 1942. 

After the objectives were agreed upon, the search began for types of 
materials through which these objectives are expressed. A method of 
scoring the reactions was then worked out so that a more precise agree- 
ment between the defined objective and the evaluating instrument 
could be realized. In the final step, a careful interpretation of the whole 
procedure was developed. In the book just referred to, these three steps 
—(1) finding materials, (2) discovering means of registering accurately 
the reactions of subjects, and (3) checking the objective against the 
results thus achieved—were employed for each of the 10 objectives 
listed above. Here we will summarize only the procedure used in 
evaluating methods of thinking. 

In the first instance, “methods of thinking” was defined more clearly. 
It was agreed that methods of thinking included at least (1) ability to 
interpret data, (2) ability to apply principles, (3) [understanding of] 
the nature of proof, and (4) ability to formulate hypotheses. The 
ability to interpret data involves (a) ability to perceive relationships 
in data, and (b) ability to recognize the limitations of data. In this 
manner each of the four abilities was analyzed into smaller, more 
understandable parts which could be clearly [defined] and whose 
expression could be observed in selected [materials]. In validating and 
appraising such an objective as methods of thinking, there were no limits 
to the types of material that could be used. The form, however, must be 
new to the subject, or else his act would be a [matter of] memory. The 
social sciences and natural sciences offered satisfactory material for 
this purpose, and so selections were made from [them]. An illustration 
of the procedure is provided by a sample exercise (Problem 1) from 
Form 2.52:¹ 

These data alone 

(1) are sufficient to make the statement true. 
(2) are sufficient to indicate that the statement is probably true. 
(3) are not sufficient to indicate whether there is any degree of truth 
or falsity in the statement. 
(4) are sufficient to indicate that the statement is probably false. 
(5) are sufficient to make the statement false. 

1 Smith, E. R., R. W. Tyler, et al., Appraising and Recording Student 
Progress, pp. 52-53. New York: Harper & Brothers, 1942. Quoted by 
permission. 




[Figure: line chart. Vertical axis: per cent relative to the year 1900. 
Horizontal axis: 1900 to 1925 in five-year steps. Curves: volume of farm 
production; farm population of employable age; number of farm workers 
employed.] 

Fig. 1. Problem 1. This chart shows production, population, and employment on 
farms of the United States for each fifth year between 1900 and 1925. 


Statements 

1. The ratio of agricultural production to the number of farm workers 
increased every five years between 1900 and 1925. 
2. The increase in agricultural production between 1910 and 1925 was due 
to more widespread use of farm machinery. 
3. The average number of farm workers employed during the period 1920 to 
1925 was higher than during the period 1915 to 1920. 
4. The government should give relief to farm workers who are unemployed. 
5. Between 1900 and 1925, the amount of fruit produced on farms in the 
United States increased about fifty per cent. 
6. During the entire period between 1905 and 1925 there was an excess of 
farm population of employable age over the number of people needed to 
operate farms. 
7. Wages paid farm workers in 1925 were low because there were more 
laborers than could be employed. 
8. More workers were employed on farms in 1925 than in 1900. 
9. Since 1900, there has been an increase in production per worker in 
manufacturing similar to the increase in agriculture. 
10. Between 1900 and 1925, the volume of farm production increased over 
fifty per cent. 
11. Farmers increased production after 1910 in order to take advantage of 
rapidly rising prices. 
12. The average amount of farm production was higher in the period 1925 to 
1930 than in the period 1920 to 1925. 
13. Between 1900 and 1925 there was an increase in the farm population of 
employable age in the Middle West, the largest farming area in the United 
States. 
14. Farm population of employable age was lower in 1930 than in 1900. 
15. The production of wheat, the largest agricultural crop in the United 
States, was as great in 1915 as in 1925. 


From such a test we may secure eight different scores: (1) general 
accuracy, (2) probably true or probably false, (3) insufficient data, 
(4) true-false, (5) omitted, (6) caution, (7) beyond data, and (8) crude 
errors. Items 1, 2, 3, 5, 7, and 8 are self-explanatory. Item 4, true or 
false, gives the percentage of times the subject recognized a true 
statement as true and a false statement as false. Item 6, caution, refers 
to the withholding of the degree of truth which the makers of the tests 
would allow. In thus producing an analyzed score the test could focus 
the teacher’s thought on the weak and strong points in the student’s 
ability to think. 

If one of the objectives striven for by teachers in instructing high 
school students is the ability to apply principles of science, and if this 
objective is analyzed and areas discovered where the application is 
feasible, then the degree to which the objective has been reached may 
be measured. The teacher can then decide whether or not his teaching 
procedures have been effective for this purpose, and the student can be 
properly guided into activities which demand the amount of scientific 
generalization achieved by him. This procedure in test construction is 
interesting because the whole process from objective to the evaluation 
of the instrument is set before us. Moreover, the attempt was made to 
develop instruments in areas where no satisfactory instruments already 
existed. Finally, it is instructive to those of us now working on validity 
because the authors really set down validity as the first and foremost 
of their criteria in the construction of their tests. 


Intelligence Tests 

In constructing intelligence tests there is no common pool of 
information from which questions can be drawn. Items, in general, are 
selected because they are drawn from the common environment, because they 
are passed by an increasing number of subjects with increasing age, or 
because an increasing percentage is passed as I.Q.s increase from 90 to 
100 to 110. In the Stanford Revision of the Binet-Simon tests, if a 
smaller percentage of children whose I.Q.s were 110 passed the item than 
of those with I.Q.s of 90, the item would not be selected. Items used in 
the Terman-Merrill Revision were also correlated with the test as a 
whole. If the new item did not agree well with a score based on the 
total of items, it was eliminated. A different way of selecting items 
appears in the work of Maurer.¹ For a long time it had been known that 
tests given in the early years of life did not well predict standing in 
the later years. Maurer was able to study the predictive capacity of 
items of the Minnesota preschool tests by correlating them with a group 
intelligence test given in late adolescence. She demonstrated that tests 
could be selected which would predict later standing on group tests of 
intelligence. Thus a new procedure for selecting items was developed. 

1 Maurer, Katherine M., Intellectual Status at Maturity as a Criterion for 
Selecting Items in Preschool Tests. Minneapolis: University of Minnesota 
Press, 1946. 
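
The practice of correlating each item with the test as a whole may be sketched as follows. The pass-fail records are invented, the function names are hypothetical, and the ordinary Pearson coefficient is used here only as a convenient stand-in: the text does not state which statistic the test makers actually employed.

```python
from math import sqrt

def pearson(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def item_total_correlations(responses):
    """responses: one row per subject; 1 = item passed, 0 = item failed."""
    totals = [sum(row) for row in responses]  # total includes the item itself
    n_items = len(responses[0])
    return [pearson([row[j] for row in responses], totals) for j in range(n_items)]

# Hypothetical records for six subjects on three items (assumed data).
responses = [
    [1, 1, 0],
    [1, 1, 0],
    [1, 1, 0],
    [1, 0, 1],
    [0, 0, 1],
    [0, 0, 1],
]

for number, r in enumerate(item_total_correlations(responses), start=1):
    verdict = "keep" if r > 0 else "eliminate"
    print(f"item {number}: r = {r:.2f} ({verdict})")
```

In this invented set the third item is passed chiefly by subjects with low totals, so its coefficient is negative and it would be eliminated.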


Aptitude Tests 


Items for aptitude tests have been selected by a psychological analysis 
of the factors involved, as in Seashore’s Measures of Musical Talent, 
or by a correlation of each item with some criterion of success. The latter 
procedure, most generally used at present, appraises what is known as 
external validity. If we were selecting items for a clerical-aptitude test, 
internal validity would demand that each item correlate well with the 
total score. Suppose we should take the highest 27 per cent and the 
lowest 27 per cent from scores made on our total test. Any item that 
was passed by a much larger percentage by members of the highest 
group than by those of the lowest group would be a suitable one. If 
an item were passed by a larger percentage of subjects in the lower group 
than in the upper it would not be discriminative and therefore could 
not be used. 
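
The comparison of the highest and lowest 27 per cent can be worked out as below. The scores, the item records, and the function name `discrimination_index` are assumptions made for the illustration; only the 27 per cent division and the rule of rejecting items passed more often by the lower group are taken from the paragraph above.

```python
def discrimination_index(item_passed, total_scores, fraction=0.27):
    """Proportion passing in the top group minus proportion passing in the bottom group."""
    n = len(total_scores)
    k = max(1, round(n * fraction))  # size of each extreme group
    order = sorted(range(n), key=lambda i: total_scores[i])
    low, high = order[:k], order[-k:]
    p_high = sum(item_passed[i] for i in high) / k
    p_low = sum(item_passed[i] for i in low) / k
    return p_high - p_low

# Hypothetical total scores for eleven subjects (assumed data).
totals = [5, 9, 12, 14, 15, 17, 18, 20, 22, 25, 28]
# 1 = passed, 0 = failed, in the same subject order as `totals`.
item_a = [0, 0, 1, 0, 1, 1, 0, 1, 1, 1, 1]
item_b = [1, 1, 1, 0, 1, 0, 1, 0, 0, 1, 0]

for name, item in (("item A", item_a), ("item B", item_b)):
    d = discrimination_index(item, totals)
    print(name, "suitable" if d > 0 else "not discriminative", round(d, 2))
```

With eleven subjects each extreme group contains three; an item with a negative difference, such as the invented item B, is the kind the text says could not be used.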

EXTERNAL VALIDITY 


However well a test is prepared there is no certainty of its usefulness 
until it is tried out by comparing it (1) with actual achievement in a 
practical situation, or (2) with other measures of the same area. After 
all, there is usually some measure outside the test itself against which 
this measuring instrument may be projected. These outside measures 
are called criteria. If satisfactory criteria could be established for all 
tests, their validity might be efficiently appraised. 


Achievement Tests 


The criteria against which we attempt to measure achievement 
tests are usually much less effective measures of achievement than the 
tests themselves. One might use teachers’ marks as criteria of success 
but they are compounded of many elements in addition to achievement 
in school subjects. Teachers' ratings of achievement in reading or 
arithmetic make the criterion purer but add to the problem the 
unreliability of rating. Achievement tests have rather high correlations 
(.70 to .80) with intelligence-test scores, but these have many other 
components than achievement. For this reason, constructors of achievement 
tests are depending more and more on curricular validity. 


Intelligence Tests 


It is also difficult to discover adequate criteria against which to 
measure intelligence tests. One criterion which has sometimes been 
used is the rating of three or four competent persons. The [intelligence] 
of some hundred children is rated by three persons who know them well. 
The average of these ratings is computed, and with this average the 
scores of the test are correlated. Another criterion with which group 
test scores have been compared is the individual test. Thus a group test 
of intelligence may be correlated with an individual test which has been 
long established, such as the Stanford-Binet. On one occasion, the author 
correlated the scores of four group tests with the Stanford Revision of 
the Binet-Simon tests [on the assumption] that perhaps the one with the 
highest correlation with this individual test might be a more efficient 
measuring instrument for [group use]. School marks, in spite of the 
multiplicity of factors which enter into their composition, have been 
used as criteria both for achievement tests and for intelligence tests. 
In one case (Terman, 1916) the coefficient of .48 was given as existing 
between the Stanford Revision intelligence test and school marks. In 
general, the correlations between average school marks and 
intelligence-test scores range from .40 to .60. 
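
The first criterion described above, the averaged rating of several competent persons, lends itself to a short worked sketch. The ratings, the test scores, and the helper name `pearson` are invented for the purpose; nothing here reproduces data from any study cited in the text.

```python
from math import sqrt
from statistics import mean

def pearson(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical ratings of five children by three raters on a five-point scale.
ratings = [
    [2, 3, 2],
    [3, 3, 4],
    [4, 5, 4],
    [1, 2, 2],
    [5, 4, 5],
]
# Hypothetical scores of the same five children on the test being validated.
test_scores = [48, 60, 71, 40, 69]

# Step 1: average the raters for each child to form the criterion.
criterion = [mean(child) for child in ratings]
# Step 2: correlate the test scores with that averaged criterion.
validity = pearson(test_scores, criterion)
print(round(validity, 2))
```

Averaging the raters before correlating reduces the influence of any one rater's idiosyncrasies, which is why the pooled rating rather than a single judgment serves as the criterion. The coefficient from so small and tidy an invented sample is far higher than the values reported for real data.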

An illustration of validation through statistical procedures may now 
be given. The author¹ had in mind the determination of the highest 
validity among four group tests of intelligence: Army Alpha, Terman 
Group, Otis Advanced, and Miller. In this study each of the four tests 
was measured against four important criteria: (1) the Stanford-Binet, 
(2) teachers’ ratings of intelligence, (3) school marks, and (4) a 
composite made up of a combination of all four group tests. Each of the 
students was tested with all four group tests as well as with the 
Stanford Revision of the Binet test. The teachers’ ratings of intelligence 
represented the average ratings of four critic teachers who knew the 
students well. The school marks were averaged for each student. The 
comparisons in all instances were made by means of Pearson’s coefficient 
of correlation. 

1 Jordan, A. M., “The Validation of Intelligence Tests,” Journal of 
Educational Psychology (1923) 14:348-366, 414-428. 

1. [Text illegible] computed with the scores on the [text illegible] in 
the neighborhood of .68 [text illegible]. These results indicate [text 
illegible] these four group tests do measure [text illegible] common to 
the Stanford-Binet, but there is an area of unlikeness between any one 
group test and the individual test. 

2. In the case of teachers’ ratings of intelligence, the correlations 
computed with the group tests ran from .60 to .70. Here again the 
agreement is substantial between group tests and what competent 
persons judge to be the presence of intelligence. 

3. When school marks were correlated with each of the four group 
tests, the coefficients varied around .47, with the lowest being .45 and 
the highest .49. According to these figures intellectual factors measured 
by our group tests entered into the securing of school marks to only 
a moderate degree. The correlations are, however, of about the same 
size as that found for the Stanford-Binet and school marks (r = .48). 

4. Finally, when the group tests were correlated with a composite 
score made up of all of them combined, there is an entirely different 
size of correlations, for now they are .90 and above. This signifies that 
each group test is measuring about the same characteristics as the 
combination. The fact that each group test's score was included in the 
composite tended to raise the size of these coefficients. 
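
The remark that including a test in the composite raises its correlation with that composite can be checked on a small invented example. The four sets of scores and the names `test_a` through `test_d` are hypothetical and do not represent Army Alpha or any of the other tests; the sketch merely compares the coefficient obtained with the test's own score in the composite against the coefficient obtained with it left out.

```python
from math import sqrt

def pearson(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical scores of six students on four group tests (assumed data).
scores = {
    "test_a": [10, 12, 15, 11, 18, 14],
    "test_b": [11, 10, 16, 13, 17, 12],
    "test_c": [9, 13, 14, 12, 16, 15],
    "test_d": [12, 11, 13, 10, 19, 15],
}

def composite(names):
    """Sum the named tests student by student."""
    return [sum(values) for values in zip(*(scores[name] for name in names))]

all_tests = list(scores)
others = [name for name in all_tests if name != "test_a"]

with_self = pearson(scores["test_a"], composite(all_tests))
without_self = pearson(scores["test_a"], composite(others))
print(round(with_self, 3), round(without_self, 3))
```

In this invented set the coefficient is somewhat larger when the test's own score forms part of the composite, which is the direction of the effect described in point 4.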

In this same article, many of the correlation coefficients which other 
investigators had previously computed between each group test of 
intelligence and the four criteria mentioned above were collected. 
For example, between Army Alpha and high school marks 26 coefficients 
were found, and 35 with college marks. The average of these 
relationships between Army Alpha and school marks was .38. In this manner, 
when all correlation coefficients in which a group test of intelligence 
entered are collected, a great deal is known about its validity. Truly, 
a test is known by its correlations. 


Aptitude Tests 


Aptitude tests have used measures of achievement in a realistic 
situation as criteria to indicate the presence of external validity. A 
good illustration of the development of a satisfactory criterion appears 
in the standardization of the Minnesota Mechanical Ability Tests. 
The criterion which was finally utilized was the quality of mechanical 
work which students produced in junior high school. This quality was 
arrived at by direct observation and inspection of the work, by actual 
measurement of the product, and by judging the output by refined 
scales. Time has shown that the criterion was a good one and that the 
time consumed in constructing an adequate criterion was well spent. 
Throughout this text examples of criteria used will be illustrated 
whenever tests are discussed. 


Recent Trends in Test Validation 


In recent years there have been no fundamental changes in studying
the validation of intelligence tests. There has, however, been extension
in three directions: (1) one test is studied at a time, (2) the number of
criteria against which the test is projected has been increased, and
(3) there has been great interest in the validity of individual parts of
tests. At the present time the validity of an intelligence test is determined
by its correlations with the following criteria:

1. School marks. Average school marks and marks in individual
subjects are utilized. Sometimes scores obtained from educational
achievement tests are used.

2. Other intelligence tests, especially those which have been used a 
long time and about which much is known. 

3. Mechanical, clerical, and artistic ability as measured by tests in 
these fields. 

4. Success on the job. There has been much interest here in con- 
nection with the use of tests in guidance. Success in salesmanship and 
teaching are examples. 

5. Amount of education which individuals have achieved. The
correlations are made with the highest grades achieved in school.

6. Length of time remaining in school or progress through school.

7. Many other miscellaneous criteria.

Against all these criteria are projected both the test as a whole and
each of its major parts.

Since many of these criteria of validity are considered when the
various intelligence tests are treated in this text, a few illustrations
only will be given here.¹ Thus two investigators found that correlations
between the A.C.E. (American Council on Education) Psychological
Examination and the school marks of the University of Chicago freshmen
ranged from .48 (biological sciences) to .57 (social sciences).²
This same A.C.E. test correlated from .58 to .67 with the Terman-Merrill
Revision.³ Another student computed a correlation of .62
between the A.C.E. and name checking and one of .26 between the
A.C.E. and number checking.⁴


Vitiating Factors in Validity 

The validity of a measuring instrument is sometimes reduced in
effectiveness by impurities which creep either into its content or into
its administration. Some of these factors are:

1 For a much more exhaustive treatment of this topic, see Super, Donald E.,
Appraising Vocational Fitness, Chap. VI. New York: Harper & Brothers, 1949.
See also Seagoe, M. V., “Prognostic Tests and Teaching Success,” Journal of
Educational [...].
2 [...], “A Comparative Study of [...] Tests Given at the University of
Chicago,” Educational and Psychological Measurement (1941) 1:85-92.
3 [...] et al., “The New Stanford-Binet at the College Level,” Journal
of Educational Psychology (1940) 31:705-709.
4 Super, Donald E., “The A.C.E. Psychological Examination and Special
Abilities,” Journal of Psychology (1940) 9:221-226.




1. In some cases a test item which seems to be a good measure of
one objective measures another also. An item in an intelligence test
might conform to all criteria used in its selection but, because it depends
on reading, would make a poor item for measuring the intelligence of
slow readers.

2. In certain tests of clerical ability, speed is the dominant factor in
making a good score. Some teachers have so insisted on accuracy that 
when their students took this test of clerical ability so dependent on 
speed, they could not force themselves to speed up. With this group of 
students the test was invalid for measuring rate. 

3. In the Strong Vocational Interest Blank the subject votes L-I-D 
(like, indifferent, dislike) on most of the items. It was thought that 
subjects in the vast majority of cases would use either L or D and 
would use I only when they simply could not decide. Some subjects,
however, are unable to make affective judgments of either L or D and
use I on a very large number of items. For this group no clear direction 
of vocational interest can be secured from the administration of the 
blank. 

4. Through experimenting with the true-false technique used in
constructing test items it was discovered that students when in doubt
mark the item “True.” Such items, if true, would be correctly marked.
If false, they would be incorrectly marked and would therefore be a
more precise measure of the subject’s knowledge. In one study Cronbach¹
showed that the reliability of his “false” items was .72, that of
his “true” items, .11. False items, then, were more reliable and more
useful.

In short, a variety of unpredictable human factors sometimes prevent
the item from measuring those processes for which it was prepared
and thus invalidate it for the purpose at hand.


RELIABILITY 


A good measuring instrument must of necessity possess the characteristic
of reliability. Reliability implies precision or accuracy. When
a test possesses high reliability its results vary little from one test to
another. It gives nearly the same results on two successive occasions.
Suppose that a child receives a mental age of 6 years and 10 months
(6-10) on one testing of the Terman-Merrill Revision and 7-0 at the
next which is given one week later. These are accurate results, and if
100 pupils were tested on two occasions a week apart and registered
such small variations for each of the 100 subjects involved, the test
would be designated “highly reliable.” In validity the emphasis is on the


1 Cronbach, Lee J., “Studies in Acquiescence as a Factor in a True-False Test,”
Journal of Educational Psychology (1942) 33:401-415.




test's agreement with the objective; in reliability, upon agreement with 
itself. In terms of the oft-worked analogy of linear measurement, the 
yardstick’s validity is determined by its agreement with the standard 
yard in our National Bureau of Standards, its reliability, by its agreement
with itself. A certain board’s length remains at 16¾ inches
through three successive measures, a fact which indicates the measuring
instrument’s lack of variation (its reliability). Further understanding
of reliability may be achieved by following carefully the four methods
which are used for computing it.

METHODS FOR COMPUTING RELIABILITY OF TESTS

In three of the four methods the technique used for measuring
reliability is the coefficient of correlation.

1. The repetition of the same test. When there is only one form of a
test, reliability may be measured by the correlation between the scores
received from two administrations of the same test. Each of 100 subjects,
say, would possess two scores received on the same test given at
different times. The reliability would be obtained by computing the
coefficient of correlation between them. One can readily see that when
the same test is repeated some of the children will remember the items
from its first administration and some curious ones will have looked
up the answers or asked their parents. One question which always
arises relates to the amount of time which should elapse between the
two testings. If only a short time elapses, then the memory factor may
be quite large; if a long time, the scores achieved are affected by the
amount of growth which has taken place during this period. There is
also the problem of the variable physical and emotional reactions from
one test to another, since a child who is well oriented on one occasion
is in a state of emotional excitement on another because an aunt has
died or perhaps because Christmas is in the offing. For all these reasons
this test-retest technique is now rarely used.
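
The test-retest computation described above can be sketched in a few lines of Python. This is only an illustration: the pupil scores are invented, and the helper name `pearson_r` is an assumption of this sketch rather than anything taken from the text.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two score lists of equal length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations about the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical scores for six pupils tested twice, one week apart.
first_testing = [82, 75, 91, 60, 70, 88]
second_testing = [84, 73, 93, 63, 69, 90]

retest_reliability = pearson_r(first_testing, second_testing)
print(round(retest_reliability, 2))
```

With real data the two lists would hold the scores of the same pupils, in the same order, from the two administrations.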
2. The use of two forms of the same test. If a test has two equivalent
forms with the same mean, the same variation, and the same [...]. In
general, subjects, because of [...] similarity of content between [...],
[...] slightly larger [...] on the second test. Since there is a tendency
for all subjects to [...] their scores by a small amount, the correlation
[...]. All that was said about the changes in emotional attitudes and
interests in the case of the test-retest technique is also true here. One
might say that the reduction of the coefficient from 1.00 is an indication
of the effect of chance errors just described, since
constant errors do not affect the coefficient. Chance errors produce
changes in a score's position either up or down and thus reduce the size
of the reliability coefficient.

3. The odds-even or split-half method. This method does not involve
the repetition of a test either in the same form or in a different form.
In applications of this procedure, after the test is given the items are
divided into equivalent parts or tests by placing the correct odd items
in one part and the correct even items in the other. If the items of the
test have been well scaled in difficulty in the first instance, two equivalent
parts can be constructed. These two parts are now treated as two
forms of the same test and the coefficient of correlation computed
between them. We thus have a reliability coefficient based on a test
half as long as the original one. How reliable would a test be which
is just twice as long as the half-tests just now constructed? To answer
this question we use the Spearman-Brown prophecy formula¹

$$r_{nn} = \frac{n\,r_{11}}{1 + (n - 1)\,r_{11}}$$

where $r_{nn}$ is the correlation between n forms of a test and n parallel
forms and $r_{11}$ is the reliability coefficient. In this case $r_{11}$ is the
odds-even coefficient and is assumed to be .80. The whole
test, being twice as long as the half, would then have the following
reliability:

$$r_{22} = \frac{2(.80)}{1 + (2 - 1)(.80)} = \frac{1.60}{1.80} = .89$$

The total test’s reliability ($r_{22}$) [...] testing prefer this procedure
since [...] level, the ill effects of memory, [...] long the period between
testings [...] prophecy formula is valid only [...] cover the same ground,
are of [...] larger than one computed from [...] reliability.
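
As a rough check on the arithmetic of the split-half method, the following Python sketch applies the Spearman-Brown formula to the odds-even coefficient of .80 used above. The function names and the sample item list are assumptions made for this illustration.

```python
def odd_even_totals(item_scores):
    """Split one pupil's item scores (1 = right, 0 = wrong) into half-test totals."""
    odd = sum(item_scores[0::2])   # items 1, 3, 5, ...
    even = sum(item_scores[1::2])  # items 2, 4, 6, ...
    return odd, even

def spearman_brown(r_11, n):
    """Predicted reliability of a test n times as long as one with reliability r_11."""
    return n * r_11 / (1 + (n - 1) * r_11)

# One hypothetical pupil's six items, split into odd and even halves.
print(odd_even_totals([1, 0, 1, 1, 0, 1]))  # (2, 2)

half_test_r = 0.80  # odds-even coefficient assumed in the text
whole_test_r = spearman_brown(half_test_r, 2)
print(round(whole_test_r, 2))  # 0.89
```

In practice the odd and even totals of every pupil would first be correlated to obtain the half-test coefficient, which is then stepped up with `spearman_brown(r, 2)`.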


1 Garrett, Henry E., Statistics in Psychology and Education, 3d ed., pp. 387-391.
New York: Longmans, Green & Co., Inc., 1947.




4. Reliability without correlation (Kuder-Richardson technique). A
newer technique for computing reliability has been developed which
requires only three sets of facts: (1) the number of items in the test
(n), (2) the standard deviation of the test as a whole ($\sigma_t$), and (3) the
arithmetic mean of the test scores ($M_t$). One formula frequently used
in the Kuder-Richardson technique is

$$r_{11} = \frac{n}{n - 1} \cdot \frac{\sigma_t^2 - n\,\bar{p}\,\bar{q}}{\sigma_t^2}$$

where

$$\bar{p} = \frac{\text{arithmetic mean of test scores}}{n} = \frac{M_t}{n}, \qquad \bar{q} = 1 - \bar{p}$$

Suppose we had a test such as the Otis Advanced Intelligence Test with
212 items, whose standard deviation was 25 and mean 150. Then

$$\bar{p} = \frac{150}{212} = .71, \qquad \bar{q} = 1 - .71 = .29$$

$$r_{11} = \frac{212}{211} \cdot \frac{625 - 212(.71)(.29)}{625} = .93$$
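
The worked example can be reproduced with a short Python sketch. It assumes the figures of the example (212 items, standard deviation 25, mean 150); the function name `kuder_richardson_21` is a label chosen for this illustration, the formula being the one commonly known as Kuder-Richardson formula 21.

```python
def kuder_richardson_21(n_items, mean, sd):
    """Reliability estimated from the item count, mean, and standard deviation only."""
    p_bar = mean / n_items   # average proportion of items passed
    q_bar = 1 - p_bar        # average proportion of items failed
    variance = sd ** 2
    return (n_items / (n_items - 1)) * (variance - n_items * p_bar * q_bar) / variance

r_11 = kuder_richardson_21(n_items=212, mean=150, sd=25)
print(round(r_11, 2))  # 0.93
```

Because only three summary figures are needed, no item-by-item tabulation or second administration is required.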


This formula posits several assumptions which are not always true.
One of these assumptions is that all items are of the same difficulty.
In so far as this is not true and there is variation of difficulty among the
items, the size of r is reduced. However, Garrett points out that this
formula will give a satisfactory approximation to a test's reliability
even when the test items cover a wide range of difficulty. Two other
assumptions, (1) that the item intercorrelations are equal, and (2)
that the test items measure essentially the same ability, must be true
if this formula is to give a very accurate reliability coefficient. Its
results are always lower than those of the other methods, so that the true
reliability is at least as high as the one this method gets. For these
reasons this procedure is recommended only when a rough estimate of
reliability is required and when a quick answer is imperative. Since
several factors influence the reliability, the student will observe carefully
(1) the procedure used in computing the coefficient, (2) the
representativeness of the population, and (3) the standard deviation
of the population used.
1 Kuder, G. F., and M. W. Richardson, “The Theory of the Estimation of Test
Reliability,” Psychometrika (1937) 2:151-160.
2 Garrett, op. cit., pp. 385-386.

FACTORS WHICH AFFECT RELIABILITY

What we measure are human reactions. These responses vary greatly
from time to time, even to the same situation. So much


depends on interest and effort, on physical conditions, on emotions, 
and on thought processes already in progress that even under the best 
conditions there would be some variation from one time to the next. 
Even if the measuring instrument were perfect and the conditions of
the testing were ideal in every particular, there would still be variation
in the subject's responses, a fact which would lower the reliability.
Whenever reliabilities are reported for a test, it is understood that 
testing was done under good conditions by a person who knew children 
and who knew the importance of carrying out accurately the written 
instructions of the test. 

1. Factors which reduce reliability. We can divide these factors into 
three groups. First, the subject—all those factors which cause variation 
in his reactions reduce reliability or accuracy. Here we have variations 
in motivation, in emotional balance, in physical level, and in thought 
processes already established. Second, the tester sometimes does not
follow the instructions exactly, is careless about the time allowed for 
each test, does not see that the young child understands the problem 
before the test proceeds, and is not keenly sensitive to the possibility 
of cheating. Sometimes the tester, too, is a “deadpan” who somehow 
or other does not inspire children to want to work. What is desired 
on all tests is the best which subjects can produce. Any variation from 
the best is apt to lower reliability. In the third place, the scorers may 
not be accurate in their scoring. It is so easy to make mistakes in scoring.
Particularly is this true when the test itself allows the scorers some dis- 
cretion in interpreting the answers. On the Stanford Revision, for 
example, at year VII there is a diamond to be drawn which must be 
judged as passed or failed. In many cases the judgment is easy but often 
there is disagreement among equally competent observers. When words 
are to be written in to complete sentences, such as “______ should
prevail in libraries and churches,” a bright student will sometimes 
suggest a word which was not intended by the test builder and which 
therefore will be interpreted differently by different scorers. 

2. Factors which increase reliability. The factors which increase
reliability are, first of all, the opposite of the conditions which reduce
it: good motivation which extends throughout the test, emotional
calmness, careful administration, and effective scoring. Secondly, the
lengthening of the test affects directly its reliability. This fact might
have already been inferred from the fact that a whole test is more
reliable than a half of one. Sometimes a test constructor has this
problem: “My test, which has 75 items and takes 30 minutes to administer,
has a reliability of .85, but I want a reliability of .95. How
much longer will my test have to be to secure a reliability of .95?”
Again the Spearman-Brown formula becomes useful, but now we have
to solve for n:

$$r_{nn} = \frac{n\,r_{11}}{1 + (n - 1)\,r_{11}}$$

This becomes

$$.95 = \frac{n(.85)}{1 + (n - 1)(.85)}$$

which when solved for n gives 3.5. He would need then a test of 3.5
times 75 (or 262) items in length and one that would take 1 hour and
45 minutes to give. While it might be more efficient in this case to work
on the internal consistency and structure of the test rather than merely
to lengthen it, still the importance of the mere length of the test for
reliability is clearly demonstrated.
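
The Spearman-Brown formula can be rearranged to give n directly, n = r_nn(1 - r_11) / [r_11(1 - r_nn)]. The Python sketch below applies this rearrangement to the constructor's problem; the function names are assumptions of this illustration. Solved exactly, n comes to roughly 3.35, a little under the 3.5 quoted above, so the figures of 262 items and 1 hour and 45 minutes are best read as generous round estimates.

```python
from math import ceil

def spearman_brown(r_11, n):
    """Predicted reliability of a test n times as long as one with reliability r_11."""
    return n * r_11 / (1 + (n - 1) * r_11)

def length_factor(r_current, r_target):
    """How many times longer the test must be to reach r_target (Spearman-Brown solved for n)."""
    return r_target * (1 - r_current) / (r_current * (1 - r_target))

n = length_factor(r_current=0.85, r_target=0.95)
print(round(n, 2))        # 3.35
print(ceil(n * 75))       # items needed, starting from a 75-item test
print(round(n * 30))      # minutes needed, starting from a 30-minute test

# Substituting n back into the formula recovers the target reliability.
print(round(spearman_brown(0.85, n), 2))  # 0.95
```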

3. The range of the subjects. This too affects the reliability. Let us
take a case where only three subjects were included: one a genius, one
an average child, and one an idiot. In all tests the variation of the
idiot would never be so great as to exceed the average individual, nor
would the average child score as high as the genius or as low as the idiot.
In this case the reliability would be represented by 1.00 on the poorest
of tests. Make the range great enough and your reliability is practically
perfect. But such tests would not be valuable because their scores
would vary too much from time to time. What we want is a test which
will distinguish between subjects closely alike in, say, their intelligence.
We need a test which reveals correctly the members of a single class
where the variation in scores may be small. In short, reliability computed
from a population composed of the members of three grades
would necessarily be higher than from a population drawn from a single
grade. Kelley’s formula may be applied:

$$\frac{\sigma_s}{\sigma_l} = \frac{\sqrt{1 - r_{ll}}}{\sqrt{1 - r_{ss}}}$$

where l = large group
      s = small group

If a single sixth grade with a $\sigma_s$ of 10 gave an $r_{ss}$ of .60, what
would the reliability become if we used three grades with a
standard deviation of 20? This becomes in the formula:

$$\frac{10}{20} = \frac{\sqrt{1 - r_{ll}}}{\sqrt{1 - .60}}$$

$$r_{ll} = .90$$
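
Solving Kelley's formula for the reliability in the wider group gives r_ll = 1 - (σ_s/σ_l)²(1 - r_ss). A minimal Python sketch of that rearrangement follows, using the sixth-grade figures above; the function name is an assumption of this illustration.

```python
def reliability_in_wider_group(r_small, sd_small, sd_large):
    """Reliability expected in a more variable group, from Kelley's formula solved for r_ll."""
    return 1 - (sd_small / sd_large) ** 2 * (1 - r_small)

# Single grade: sigma = 10, reliability .60; three grades: sigma = 20.
r_large = reliability_in_wider_group(r_small=0.60, sd_small=10, sd_large=20)
print(round(r_large, 2))  # 0.9
```

Doubling the spread of the group raises the coefficient from .60 to .90 although the test itself is unchanged, which is the point of the passage.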


1 Kelley, Truman L., Statistical Method, p. 222. New York: The Macmillan
Company, 1923.




From this discussion it is clear that test constructors should define
meticulously the variation of the subjects from whom the reliability was
computed. In general we can say that the variability of the standardizing
population should correspond to the variability of the class or grade for
which the test is going to be used. If discrimination is required for a
single grade, then the reliability should be computed from a population
composed of members of a single grade. In Pintner’s Verbal Series of
Intelligence Tests the intermediate scale's reliability is computed from
children within the age range of one year. This is an excellent illustra- 
tion of correct procedure. 


INTERPRETATION OF RELIABILITY COEFFICIENTS


The practical questions arising out of the previous discussion are
“What do these coefficients mean?” and “How large must the coefficient
of reliability be to be satisfactory?” In the first place, the answer
to the question depends on the accuracy required for the purpose at
hand. If one wants merely to distinguish between two groups of individuals,
a reliability of .50 will be satisfactory, but if he wishes to
distinguish between individuals in such a way that the score indicates
an accurate estimate of an individual's present status and some indication
of his future achievement, the correlation indicating reliability
must be much higher. In this latter case, the coefficient should be above
.90, as much above as we can get. Some of our best achievement tests
have coefficients as high as .96 or .97. The reliability of the Terman-Merrill
Revision for all I.Q.'s computed above the age of 6 years is
.93; for the feebleminded the reliability is .98 (the highest reported).
It is not at all unusual for intelligence tests or achievement tests constructed
in recent years to report a coefficient as high as .95. This is
true when due regard has been paid to the variation of the subjects
used in the computation. In these two areas one should not choose
a test whose reliability is below .90, certainly not for use in school.
In the areas of interests, attitudes, neuroticism, ratings, etc., one has
to be satisfied with instruments which are not quite so reliable.

From the reliability coefficients one may calculate the efficiency of
one form of a test in forecasting scores on another form. The coefficient
which is used to calculate this relationship has been called the coefficient
of forecasting efficiency.

$$E = \left(1 - \sqrt{1 - r^2}\right) \times 100$$

If r = .85, then E = 47 per cent
If r = .90, then E = 56 per cent
If r = .95, then E = 69 per cent
If r = .98, then E = 80 per cent
If r = .50, then E = 13 per cent



Notice the difference in efficiency between a reliability of .90 and one 
of .95, an increase of 13 per cent. Probably most surprising of all is the 
difference in efficiency between a reliability of .95 and one of .98. This 
increase in reliability of .03 is accompanied by an increase in efficiency 
of 11 per cent. 
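
The coefficient of forecasting efficiency is easy to tabulate for any reliability. The following Python sketch assumes the formula E = (1 - √(1 - r²)) × 100; the function name is chosen for this illustration only.

```python
from math import sqrt

def forecasting_efficiency(r):
    """Per cent improvement over chance in forecasting one form's score from the other."""
    return (1 - sqrt(1 - r ** 2)) * 100

for r in (0.50, 0.85, 0.90, 0.95, 0.98):
    print(f"r = {r:.2f}   E = {forecasting_efficiency(r):.0f} per cent")
```

The steep rise of E near the top of the scale is why small gains in reliability above .90 are worth so much effort.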


Probably the most practical and perhaps the best interpretation of
all arises out of the concept of the variation of the obtained score.
After all, we want to know how much confidence we can place in the
individual score. In brief, if a subject were tested 100 times on this test
until his true score were obtained, how much is his present score likely
to vary from this true score? The formula used for this calculation is:

$$\sigma_{1\infty} = \sigma_1 \sqrt{1 - r_{11}}$$

where $\sigma_{1\infty}$ is the standard error of an obtained score, $r_{11}$ is the
reliability coefficient, and $\sigma_1$ is the average of the standard deviations
of the two forms. Suppose that the score of an individual on a 100-item
test is 60, the $\sigma_1$ is 7, and the coefficient of reliability is .85. Then

$$\sigma_{1\infty} = 7\sqrt{1 - .85} = 7 \times .39 = 2.73$$

or 3 in round numbers. We now apply this 3 to the score on the test,
60 ± 3. This means that the chances are 68 in 100 that the true score
lies between 57 and 63 and more than 99 in 100 that the true score
lies between 51 and 69; i.e., between 60 and three times its standard
error on one side and between 60 and three times its standard error on
the other.¹ Let us take an actual case from the Terman-Merrill Revision.
For an I.Q. of 130 the standard error of an obtained score is 5.24, or
5 in round numbers. If we apply that to 130, then we get 130 ± 5. The
chances are 68 in 100 that the true score lies between 125 and 135
and 99 in 100 that the true score lies between 115 and 145. This
seeming difficulty at first soon disappears and with practice we
think to ourselves “Score 85, standard error 10: not so good,” or
“Score 85, standard error 3: good,” because we know that the variation
in the last instance is small indeed. The [...] of the
Stanford Achievement Test is [...]. We thus say that Mary has an
educational age of 8 years and 6 months (8-6) plus or minus 2 months.
The chances are 68 in 100 that Mary's true educational age is between
8-4 and 8-8 and 99 in 100 (practical certainty) that it lies between
8-0 and 9-0.
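
The 100-item example (score 60, standard deviation 7, reliability .85) can be checked with a short Python sketch. The function names are assumptions of this illustration; the bands use one and three standard errors, as in the text.

```python
from math import sqrt

def standard_error_of_score(sd, reliability):
    """Standard error of an obtained score: sd * sqrt(1 - reliability)."""
    return sd * sqrt(1 - reliability)

def score_band(score, se, k=1):
    """Range from k standard errors below the score to k standard errors above it."""
    return (score - k * se, score + k * se)

se = standard_error_of_score(sd=7, reliability=0.85)
print(round(se, 2))        # 2.71
se_rounded = round(se)     # 3 in round numbers

print(score_band(60, se_rounded, k=1))  # (57, 63): chances about 68 in 100
print(score_band(60, se_rounded, k=3))  # (51, 69): chances more than 99 in 100
```

Carrying the square root to more places gives 2.71 rather than the 7 × .39 obtained with the rounded factor; both round to 3.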


1 These figures are derived from the normal curve which includes 68.26 per cent
between ±1σ, 95.44 per cent between ±2σ, and 99.73 per cent between ±3σ.



ADMINISTRABILITY

Another characteristic that all good measuring instruments have is

ease of administrability. Under this category may be included (1) ease 
of giving and mechanical make-up, and (2) ease of scoring. 


EASE OF GIVING

Ease of giving depends upon the adequacy of instructions. Good
instructions should be prepared both for the tester and for the subject.
Clear-cut directions are necessary for the tester which are beyond those
intended for the subject. The tester needs to know the directions for
each part of the test, what is the total possible score for each student,
and above all, precise time limits. For example, sometimes the tester
must read aloud to the subjects while they follow along reading silently.
Does the total time allowed include this reading, or does it begin from
the time the students actually start their work? The instructions should
be so clear on this point that there could be no confusion. Some tests
allow only 5, 10, 15, or 20 seconds for a single item. It is very difficult
to time these short items correctly unless the tester uses a stop watch.
Most recent tests have the instructions which are to be read aloud in
heavy print and explanations for the tester in light print. This is a
distinct advantage. Furthermore, the general make-up of the test,
such as printing and paper, affects the ease of giving. If a word to be
defined is supposed to stand out through being printed in bold letters,
then not to have these bold letters is a distinct disadvantage.

The instructions to subjects should in general be more detailed and
explicit the younger they are. However much instruction is needed
to make the problem clear to the subject, just that much is necessary.
Adequate instructions usually include (1) a statement of what is to
be done, in clear unmistakable English; (2) one or two illustrations,
correctly marked; and (3) an opportunity for the subject to try his
hand in doing a simple exercise. Some tests such as the National
Intelligence Tests probably went too far in the use of these so-called
“fore-exercises,” but others have not gone far enough. It is clearly poor
procedure for a group of children or students to start out working on
problems whose very nature is vague to them. Good testers like to
take the time to glance at these fore-exercises to make sure that the
subject has them right, before proceeding with the test proper.

In some tests, and perhaps increasingly in the future, detached
answer sheets are used. Instructions under these conditions must be
carefully and slowly given. With grades below the seventh there is
considerable doubt about the efficacy of using detached answer sheets.
Most certainly this doubt is increased if the children are not accustomed
to being tested, i.e., are not “test-broken.”




THE EASE OF SCORING


Anyone can see that under the best conditions a considerable amount 
of work is going to be required to correct test papers. Especially is this 
true if there are some items difficult to score because of some slight 
ambiguity. Objectivity in scoring makes for ease of scoring. If the test 
is composed of completion items, then the acceptable answers must be 
clearly listed. A large number of clever devices have been developed 
which facilitate scoring. Among these, window stencils were among 
the first to be used. Cutouts on a cardboard permit the scorer to see 
only the correct items and he has only to count them up. The Clapp- 
Young self-scoring device uses duplicating paper with each test. Holes 
are so cut that if the subject gets the item right his cross is registered 
on the scoring sheet underneath. All that is necessary to score a paper 
is to count the number of crosses which fall into squares. This procedure 
makes for both speed and accuracy. Most rapid of all are the new electrically
operated scoring machines. About all the corrector has to do
after the machine is set up is to copy down the answer. Such a machine 
Scores in a few seconds 150 items with only a small variation in accuracy. 

At the present time these machines are very expensive and cannot 
be owned by the small school. They require also a special type of pencil. 
Other mechanical arrangements have been tried out, but most are being
superseded by this new electrically driven device.

Even the objective short-answer classroom tests may be scored
much more easily if the subjects arrange their answers neatly in a
vertical column or better still are furnished with answer sheets which
are all alike. By placing the correct answers boldly written on strips
of cardboard, the scoring is improved in both speed and accuracy.
Window stencils may also be easily prepared for this purpose.


INTERPRETATION AND COMPARABILITY 


A striking difference exists between a standardized test and one
constructed for an ordinary class. This difference consists largely in a
difference of opportunity for interpretation. The standardized test
would not be so called unless its norms, reliability, and validity had
already been determined and published in the manual accompanying
the test. Percentile, age, and grade norms are most frequently used.
Percentile norms give the scores which marked the percentile points
in the group used in the test's standardization. They are easy to interpret
and have the advantage of giving reference points at many levels.
The standards for the first tests to be constructed consisted only of the
medians or 50th percentiles. It is easy to see the importance of having
the 5th, 10th, 15th, ..., 55th, 60th, etc., percentiles as points of


reference. Thus we can say that Sue, who is in college, scores at the 
25th percentile among 700 college students. 

Some authors have recommended highly that grade and age standards 
be published for (1) city, (2) town, and (3) rural areas, since these 
differ among themselves. The rural child is noticeably handicapped on 
tests dependent upon reading and vocabulary for the scores received. 
These separate norms are certainly desirable but might impose too 
hard a task on the test maker. In lieu of this, some facts could be 
published in the manual indicating the usual differences on this test 
between rural and city children. There is something to be said in favor 
of one norm from which children deviate because of their unusual 
environment. Local norms established in a single school system are 
very helpful. Suppose that the fourth, fifth, sixth, and seventh grades 
of a certain school system scored usually 4.4, 5.3, 6.4, and 7.5 respec- 
tively on the Metropolitan Achievement Test at the end of the year 
and that these scores had become pretty well established. They then 
could be used as local norms. As a consequence, when the children 
of a fifth grade under a new teacher scored only 5.2 at the end of the 
year the administration need not be unduly alarmed. Or again suppose a
child transferred from a neighboring state or school scored 4.4 at the 
beginning of the year. He could be placed immediately among his peers 
in the fifth grade, while if the administration went by the national 
norms he would most likely be placed in the fourth grade. Standard 
norms and local norms are essential for good interpretation of test scores.

If the derived scores are used instead of the raw scores which are
obtained from the test, a published table should transmute the raw
score immediately into a standard score. The most convenient place
for this table is at the bottom of the page. In the Pintner Intermediate
Test, for example, such a table appears with the vocabulary
test. If a subject scores 13 points the “13” is
quickly checked in the table and the standard score, 158, carried
forward to the front of the test to be used in deriving M.A. or I.Q.

Raw score      |   1 |   2 |   3 |   4 |   5 |   6 |   7 |   8 |   9 |  10 |  11
Standard score | 103 | 108 | 113 | 118 | 122 | 126 | 131 | 136 | 140 | 144 | 148

Raw score      |  12 |  13 |  14 |  15 |  16 |  17 |  18 |  19 |  20 |  21 |  22 |  23
Standard score | 153 | 158 | 163 | 169 | 176 | 182 | 188 | 196 | 204 | 212 | 219 | 227
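
A conversion table of this kind amounts to a simple lookup. The Python sketch below assumes that the raw scores run from 1 to 23, as the table appears to show, and pairs them with the printed standard scores; the names `RAW_TO_STANDARD` and `to_standard_score` are hypothetical.

```python
# Standard scores as printed in the table, for raw scores 1 through 23
# (the raw-score range is an assumption read off the table).
STANDARD_SCORES = [
    103, 108, 113, 118, 122, 126, 131, 136, 140, 144, 148,
    153, 158, 163, 169, 176, 182, 188, 196, 204, 212, 219, 227,
]
RAW_TO_STANDARD = {raw: std for raw, std in enumerate(STANDARD_SCORES, start=1)}

def to_standard_score(raw):
    """Look up the standard score for a raw score; fail loudly if it is off the table."""
    if raw not in RAW_TO_STANDARD:
        raise ValueError(f"raw score {raw} is not in the conversion table")
    return RAW_TO_STANDARD[raw]

print(to_standard_score(13))  # 158
```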


In high school, norms derived from the number of months a subject
has been studied are useful. Thus we may have a three-month norm,
a six-month norm, and a nine-month norm. These monthly period
norms could be supplemented by percentile norms at each of the three
periods.

Ease of interpretation is also facilitated by having the reliability and 
validity clearly established and by having really equivalent forms. 
The manual should state the size of the coefficient of reliability, the 
number of subjects involved, and the mean and variability of the 
population used in the standardization of the results. Good manuals 
also are clear about the validity of the test, both curricular and sta- 
tistical. In this manner, if a reading test samples closely the reading 
experiences of children in the fifth grade, let us say, then when children 
score well on this test we know that they are achieving the objectives
which are desired. Equivalent forms aid greatly in interpreting the
amount of growth made by children over a designated period of time.
They help also when an unsatisfactory test given to an individual
child has to be confirmed or denied.

ECONOMY 

In most school systems there is great need of economy in administer-
ing the testing program. Three types of economy may be mentioned:
(1) cost, (2) students' time, and (3) teachers' time. Tests which require 
as much as 75 cents per pupil are far beyond the funds available for 
testing in many schools. On the other hand, many of the best group 
tests may be purchased for 6 to 9 cents apiece. The best tests are not 
always the most expensive. Some of the more expensive tests are some-
times desirable because their length makes possible the more effective
measurement of a complicated objective. Separate answer sheets are
also designed both for cheapness and for easy scoring. It is also evident
that if too much of a student's time is required for testing too little is
left for learning. There is also danger of creating a sullen, negative
attitude in students if tests are too long and too involved. In the third
place, teachers cannot be expected to stay after school and correct
long complex tests. All that was said about the economy of administra-
tion applies here. Matters of cost, student time, and teacher time must
all be considered in planning for any adequate program of testing. 
SUMMARY 
Characteristics of a good testing instrument may be divided into
(1) validity, (2) reliability, (3) administrability, (4) interpretation,
and (5) economy. Validity is divided into internal or curricular
validity and external or statistical validity. Curricular validity is
directly related to the objective and attempts to discover types of
responses which are related to the objective, to find a way to quantify
them, and to evaluate the responses in terms of the objective. Test
constructors in the past have proceeded along practical lines. Items in
educational achievement tests have usually been selected because they
are common to several textbooks or courses of study, appear in well-
constructed examinations, have proved their social utility, or agree
with outcomes of education which a staff of experts has agreed upon.
External validity compares the evaluating instrument with other 
measures of the same objective or outcome. Thus a group intelligence 
test may be correlated (1) with an individual intelligence test, (2) with a
composite of group tests, (3) with teachers' ratings of intelligence, and
(4) with school marks. Reliability refers to the accuracy of the instru-
ment, its freedom from chance variation. Four methods of computing
it are presented: (1) test-retest, (2) Form A with Form B, (3) the odd-
even technique, and (4) the Kuder-Richardson technique. Reliability
was shown to depend on the length of the test, the dispersion of the 
population, and the efficiency of the test's administration. Admin- 
istrability refers to all those procedures of giving and scoring which 
affect the efficiency of a test. Instructions for giving and scoring must 
be clear and unambiguous. Devices for rapid accurate scoring must be 
furnished. The paper on which the test is printed and the mechanical 
make-up of the test are also items affecting the administrability of a
measuring instrument.
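
As an illustration of the odd-even technique named in this summary, the following Python sketch correlates each pupil's score on the odd-numbered items with his score on the even-numbered items. It rests on assumptions not stated in the summary: items are scored 1 or 0, the half-test correlation is stepped up to full length with the Spearman-Brown formula, and the function names are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Product-moment correlation between two equally long lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    # Both halves must show some variation, or the correlation is undefined.
    return sxy / sqrt(sxx * syy)


def odd_even_reliability(item_scores):
    """item_scores: one list of 0/1 item scores per pupil, items in test order."""
    odd_totals = [sum(row[0::2]) for row in item_scores]   # items 1, 3, 5, ...
    even_totals = [sum(row[1::2]) for row in item_scores]  # items 2, 4, 6, ...
    r_half = pearson_r(odd_totals, even_totals)
    # Spearman-Brown step-up from half length to full length (an assumption here).
    return 2 * r_half / (1 + r_half)
```

Because the two halves are each only half as long as the whole test, the uncorrected correlation would understate the reliability of the full test, which is the reason for the step-up.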

Interpretation depends upon the care with which the norms are 
established. Age norms, grade norms, and percentile norms are most 
frequently given. If norms are given in the form of standard scores 
then transmutation tables should be readily available. Economy of the 
pupils' time, the teacher's time, and the cost involved are also practical
considerations which must be heeded in the selection of any educational
measuring instrument.


QUESTIONS AND EXERCISES 

1. What is the significance of the question "This test is valid for what?"

2. How have test constructors attempted to secure tests valid in content?

3. Explain and illustrate the relation between (a) social utility and test
validity, and (b) psychological and logical analysis and test validity.

4. Illustrate in some detail a test based on psychological analyses. Why is
such a procedure difficult to carry out? Is it worth while doing?

5. Explain the function of the criterion in securing test validity. What
criteria have been used? Are they satisfactory? Explain.

6. Describe the procedure used by the author in validating group tests of
intelligence.

7. How is reliability computed? How can the standard error of a score
be looked on as a measure of reliability? Explain. Given a score of 90 with a
standard error of 6. Interpret.

8. What factors affect reliability?

9. Why is the variability of the population on which the test is stand-
ardized of such great importance? What is the best age range to use in
standardizing a test? How would the use of a narrow age range affect a test's
reliability?

10. How is the coefficient of forecasting efficiency useful in explaining
measures of reliability? Illustrate.

11. What functions have fore-exercises in the administration of a test?
Explain the need for adequate instructions.

12. How can scoring be made more economical of time?

13. For what purpose are derived scores used?

14. On what factors do interpretation and comparability of a test depend?
How can they be made more effective?


BIBLIOGRAPHY 


Books 


Bingham, W. V.: Aptitudes and Aptitude Testing, "Selection of Tests."
New York: Harper & Brothers.

Cronbach, Lee J.: Essentials of Psychological Testing, pp. 48-83. New
York: Harper & Brothers, 1949.

Garrett, Henry E.: Statistics in Psychology and Education, 3d ed., Chap.
XII, pp. 380-403. New York: Longmans, Green & Co., Inc., 1947.

Guilford, J. P.: Psychometric Methods, pp. 417-418. New York: McGraw-Hill
Book Company, Inc., 1936.

Horn, Ernest, and Maude McBroom: A Survey of a Course of Study in
Reading, Extension Bulletin No. 93, College of Education Series No. 3.
University of Iowa, 1924.

Kelley, Truman L.: Statistical Method. New York: The Macmillan
Company, 1923.

Remmers, H. H., and N. L. Gage: Educational Measurement and Evaluation,
Chap. X. New York: Harper & Brothers, 1943.

Ross, C. C.: Measurement in Today's Schools, 2d ed., Chap. III. New York:
Prentice-Hall, Inc., 1947.

Smith, Eugene R., Ralph W. Tyler, et al.: Appraising and Recording Student
Progress. New York: Harper & Brothers.

Terman, L. M.: The Measurement of Intelligence, p. 55. Boston: Houghton
Mifflin Company, 1916.

Terman, L. M., and Maud A. Merrill: Measuring Intelligence, pp. 9, 12-21.
Boston: Houghton Mifflin Company, 1937.


Articles 


Allen, Mildred M.: "Relationship between Indices of Intelligence Derived
from the Kuhlmann-Anderson Intelligence Tests for Grade I and the Same
Test for Grade IV," Journal of Educational Psychology (1945) 36:252-256.

Bloom, Benjamin S.: "Test Reliability for What?" Journal of Educational
Psychology (1942) 33:517-526.

Cronbach, Lee J.: "Test 'Reliability': Its Meaning and Determination,"
Psychometrika (1947) 12:1-16.

Guilford, J. P.: "New Standards for Test Evaluation," Educational and
Psychological Measurement (1946) 10:255-282.

Guttman, L.: "A Basis for Analyzing Test-Retest Reliability," Psychometrika
(1945) 10:255-282.

Jordan, A. M.: "The Validation of Intelligence Tests," Journal of Educational
Psychology (1923) 14:348-366, 414-428.

Kuder, G. F., and M. W. Richardson: "The Theory of the Estimation of
Test Reliability," Psychometrika (1937) 2:151-160.

Landis, C., and S. E. Katz: "Validity of Certain Questions Which Purport
to Measure Neurotic Tendencies," Journal of Applied Psychology (1934)
18:343-356.

Scates, Douglas E.: "Unit Costs in the Administration of a Standardized
Test," Educational Research Bulletin (1937) 16:38-45.

Starch, Daniel, and E. C. Elliott: "Reliability of Grading High School
Work in Mathematics," School Review (1913) 21:254-259.


CHAPTER 3 


Constructing Achievement Tests 


The construction of tests and examinations is important both from 
the standpoint of understanding the more formal standardized tests 
and from that of evaluating the results of instruction. The number 
of informal tests given far exceeds that of the standardized printed 
variety. One estimate has it that the ordinary teacher gives eight tests 
of his own to one of the commercial variety. It is consequently of great 
importance that the classroom teacher know how to check up most 
efficiently on the educational progress of his pupils. 

As was pointed out in Chap. 1, there are at least three aspects of 
the learning process which throw light on our test construction: (1) the 
definition of objectives, (2) the provision of the pupils with those 
experiences whereby the goals or objectives are achieved, and (3) the 
measurement of the results obtained in order to know to what extent 
the goals have been reached, the objectives achieved. Each of these 
procedures modifies the others. If the objectives are reached, then the 
teacher can be satisfied that his objectives are achievable and that the 
procedures utilized in collecting and arranging materials by teacher and
pupils have been satisfactory. On the other hand, if many of the pupils 
have not achieved the objectives decided upon, then both procedures 
and objectives need to be studied and possibly modified. Without this 
final process of evaluation and measurement, futile objectives and in-
adequate experiences continue and tend to become hardened into
custom.

In this chapter there will be a discussion of the construction of short- 
answer, easily scorable, objective types of test as well as of the essay 
type of examination. A complete treatment of these topics with ade- 
quate illustrations would require a volume in itself. If the student will 
master the contents of this chapter and then study the types of test
construction used in standardized achievement tests, he will be able to
construct satisfactory tests of his own.


CONSTRUCTING CLASSROOM TESTS 


The proper construction of classroom tests depends, in the first
place, upon a detailed statement of the objectives to be achieved.
The objective agreed upon determines the type of examination to be
constructed. In general, objectives include (1) facts, information, and
skill; (2) techniques and methods; (3) types of mental processes such
as the capacity to interpret data and to collect and organize it; and
(4) certain attitudes, ideals, interests, and values. When these objec-
tives have been carefully defined they must then be analyzed into
objectives which can be achieved in a certain length of time. The teacher
now has to decide which type of test most nearly indicates the achieve-
ment of that objective. If a number of good test items could be formu-
lated as the learning takes place, much strain and effort would be saved
near the end of the course and more effective evaluating instruments
would be constructed.

ESSAY-TYPE QUESTIONS 

Theoretically, for a student to gather his thoughts from a well-
stocked memory, sift them out, and apply them intelligently to the
topic at hand is to display most effectively his educational attainments.
Such an answer would be in response usually to an essay-type question
introduced by "discuss," "describe," "explain," "compare," or
"indicate." Had there been substantial agreement among those who
attempted to score such attempts on the part of the student, there
probably would not have arisen the movement for short-answer
questions.

One of the clearest cases of the weakness of the essay type of examina-
tion occurred in a study in England.¹ This case is especially noteworthy
because those who graded the examinations were expert graders whose
main business in life was allotting marks to papers sent in to a central
office. The same 48 English papers were graded independently by seven
of these graders, with the following results:

Examiner | Fail | Pass | Credit | Special credit
A        |   1  |  16  |   27   |   4
B        |   0  |   2  |   34   |  12
C        |   7  |  30  |   11   |   0
D        |   0  |   9  |   36   |   3
E        |   5  |  16  |   21   |   0
F        |   2  |   7  |   37   |   2
G        |  19  |  12  |   17   |   0

Note, if you will, the difference in the number of failures allotted by
these experts. While G fails 19, B and D fail not a single paper. At the
other end of the scale B gives 12 special credits out of the 48 papers,
while equally competent C, E, and G give none at all. Look again at
B and C. B's marks lean heavily toward the higher end; C's toward the
lower end. It is thus clearly seen that the mark a paper receives depends
significantly upon the grader into whose hands it falls.

¹ Hartog, Sir Philip, and E. C. Rhodes, An Examination of Examinations, p. 20.
New York: The Macmillan Company, 1935.
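
The disagreement among the examiners can be expressed in percentages with a short Python sketch. The counts of failures and special credits are copied from the examiner table above; the names FAILS, SPECIAL_CREDIT, percent, and spread are hypothetical, and the calculation assumes, as the text states, that each examiner marked the same 48 papers.

```python
PAPERS = 48  # each examiner graded the same 48 papers

# Failures and special credits awarded by each examiner, from the table.
FAILS = {"A": 1, "B": 0, "C": 7, "D": 0, "E": 5, "F": 2, "G": 19}
SPECIAL_CREDIT = {"A": 4, "B": 12, "C": 0, "D": 3, "E": 0, "F": 2, "G": 0}


def percent(count, total=PAPERS):
    """Express a count of papers as a percentage of all papers graded."""
    return round(100 * count / total, 1)


def spread(counts):
    """Return the lowest and highest percentage awarded by any examiner."""
    values = [percent(c) for c in counts.values()]
    return min(values), max(values)
```

Under these assumptions the proportion of papers failed ranges from none at all to roughly two papers in five, depending solely on which examiner did the marking.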

There are certain surmountable difficulties in the essay-type question
which must be met if the agreement of the graders is to be increased.
Among these are the different values placed upon certain aspects of
topics, disagreements about counting off for misspelled words, and
oddities of grammar which can be provided for by consultation among
the graders. Certain other difficulties are more difficult to overcome.
Four of these more serious ones are the following:

1. That type of answer to essay questions known as padding is
usually made up of gleanings from general reading and conversation.
These rather glittering generalities may be woven together into a
fabric composed of truth, half-truth, and downright error. How should
such a discussion be judged: fail, poor, fair, or average? There is no
doubt that considerable credit is sometimes given for just such an
answer.

2. The discussion takes a direction not contemplated by the con-
structor of the test. A student may honestly interpret a question to
be answered in one way while its writer intended it to be answered in
another. This may be due to the lack of precision exercised in the item's
construction. Suppose the answer is undeniably good, though not in
the direction intended; how should it be graded? Further discussion
concerning the manner in which the essay question itself may be
improved appears on pages 59 to 63.

3. The grammar is satisfactory but there is a lack of logic in the
presentation. There are students whose ingenuity in tangling up the
logical arrangement of a test is truly a masterpiece. By the side of one
question with clear-cut sequence and excellent integration, will appear
an answer with almost no sequential arrangement, no discoverable
logic, and yet much of the material presented is factually correct.

4. In the essay type of examination or test, the sample is compelled
to be rather narrow. Four or five topics out of the 15 or 20 are about
all that can be well discussed in a test of 2 hours. The student may be
better prepared on the topic not discussed than on the one included
in the test. For these reasons the conscientious teacher who truly
desires that marks and grades be genuine indicators of educational
achievement finds himself frustrated.


Because of the reasons just described, essay-type questions in
examinations have generally shown both low reliability and low validity.
Teachers of the same subject were unable to agree on a mark for a
single paper photographed and sent to them. For example, in one of
the earliest studies a photostatic copy of a geometry paper was sent
to 116 high school mathematics teachers to be graded.¹ The scores
ranged from 28 to 92. This case is doubly interesting because there was
no padding, no misunderstanding of the question, and no particular
problem of counting off for poor spelling or bad grammar. These pro-
cedures were repeated in English, social science, and other subjects.
In the second place, wide variations in the percentages allocated to
each school mark frequently occurred even in the same school, so that
the number of failing marks varied from 0 to 15 or 20 per cent while
there was a corresponding variation in the percentages allocated to
other marks. It seemed that the mark a student received depended
almost as much upon the instructor he had fallen heir to as upon the
progress toward the defined objective. As a result of many such studies
of unfortunate experiences with ordinary school tests and examinations,
there was a rather rapid development of short-answer questions.²



SHORT-ANSWER QUESTIONS 

Short-answer questions are intended to be framed in such a way that
the crux of the matter, the base on which the whole answer rests, is
the answer to the item. In translating a sentence from a foreign language
into English the exact translation sometimes turns on the meaning of
one word. Could the meaning of this word be discovered, the jam would
be broken and the thought flow on without interruption. If this word
is not known the translation is limping and ineffectual. The builder of
short-answer tests welcomes such a word. He embodies it in a multiple-
choice test or in a completion test. Now he no longer has to try to dis-
entangle the translation of the whole paragraph. In this case the correct
answer is achieved without padding, without new direction to the
discussion, without logical difficulties and without inadequate sampling.
Sampling can be more satisfactory because many more items can be
included than in the essay examination. There are two types of short-
answer testing: (1) those based on recall, and (2) those based on
recognition.

SHORT-ANSWER TESTS BASED ON RECALL

Two types of tests based on recall are (1) simple recall, and (2)
completion.

¹ Starch, Daniel, and E. C. Elliott, "Reliability of Grading Work in Mathe-
matics," School Review (1913) 21:254-259.
² For a more complete discussion, see Ross, C. C., Measurement in Today's
Schools. New York: Prentice-Hall, Inc., 1947.


Simple Recall 


One of the oldest methods of attempting to objectify the answers
to tests and examinations is that of simple recall. This procedure differs
from the usual essay type of question by limiting the answer to one
word or one phrase. Indeed, most complicated questions requiring
explanation or discussion may be broken down into several questions
with short answers. Care must be taken to phrase the question in
such a way that the answer is definite and short, a single word if
possible. Of course, such brevity and precision require that the subject
matter be of a definite nature also.

The blanks for the answers, long enough for legible writing in all
cases, should be placed in a vertical column to the right of the questions.
It is most important that all the acceptable answers should be listed
on the scoring sheet. For testing understanding rather than rote
memory, questions and statements should be expressed in language
different from that used in the textbook. In general, usage dictates
that one point be given for each item correct.
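
The scoring rule just stated, a sheet listing every acceptable answer and one point for each item correct, can be sketched in Python as follows. The item labels, the answers placed in KEY, and the decision to ignore capitalization and extra spaces are illustrative assumptions, not part of any published scoring sheet.

```python
def normalize(answer):
    """Lower-case an answer and collapse runs of spaces (an assumed convention)."""
    return " ".join(answer.lower().split())


def score_simple_recall(responses, key):
    """Give one point for each item whose response is on the list of acceptable answers."""
    score = 0
    for item, acceptable in key.items():
        given = normalize(responses.get(item, ""))  # an omitted item earns nothing
        if given in {normalize(a) for a in acceptable}:
            score += 1
    return score


# Hypothetical scoring sheet: every acceptable answer is listed for each item.
KEY = {
    "q1": {"I.Q.", "IQ", "intelligence quotient"},  # M.A. divided by C.A.
    "q2": {"2", "two"},                             # logarithm of 100 to the base 10
    "q3": {"gerund"},                               # a verb used as a noun
}
```

A pupil who answers "Intelligence Quotient," "2," and "participle" would receive two points under this key. Listing the variants in advance is what keeps two scorers from disagreeing about the same paper.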


Illustrations 


The items of the test may be expressed (1) in the form of a question,
(2) in the form of a statement, or (3) in the form of a stimulus word.

1. Questions

a. In the expression "Dear Sir" at the beginning of a letter what punc-
   tuation follows "Sir"?                                      a. ______
b. What is the scientific name for the splitting which occurs in the
   uranium atom in the formation of atomic energy?             b. ______
c. What is the name of the port on the Adriatic that is claimed by both
   Italy and Yugoslavia?                                       c. ______
d. If M.A. is divided by C.A. what is the quotient usually called in
   psychology?                                                 d. ______
e. What is the logarithm of 100 to the base 10?                e. ______
f. What is the grammatical name of a verb when used as a noun? f. ______

2. Statements

a. Name the outstanding characteristic of the paintings of George
   Inness.                                                     a. ______
b. Name the president under whose direction the Louisiana Purchase
   was made.                                                   b. ______
c. Write the future tense, first person singular of aller.     c. ______
d & e. State the names of two men responsible for the theorem on which
   Thorndike constructed his first scientific educational scale.
                                                     d. ______ e. ______
f. The sum of two numbers is 14. The difference between the squares of
   the two numbers is 28. Find the larger number.              f. ______
g. Give the number of the amendment to the Constitution of the U.S.
   which was responsible for national prohibition.             g. ______

3. Word-or-phrase Form

a. Scientists

   Name                      Most Famous Contribution
   1. Urey                   ______
   2. Arthur Compton         ______
   3. Einstein               ______
   4. Priestley              ______
   5. Thomas Hunt Morgan     ______

b. English to French

   1. Chair                  ______
   2. Glass                  ______
   3. Go (present tense, third
      person singular)       ______
   4. Hat                    ______
   5. Desk                   ______

The simple-recall type has both advantages and disadvantages as
compared with other short-answer techniques.


Advantages 
In the true-false, multiple-choice, etc., types the correct answer
demands only the recognition of the right answer. This introduces
the unreliability involved in guessing, as in the multiple-choice tech-
nique wherein the process of eliminating the answers that are obviously
wrong leaves the choice to be made from two items rather than from
five as was intended. Now simple recall, since there is no recognition
to be made but only recall, reduces the process of guessing to a mini-
mum. One can be assured that the [...] controlled [...] process toward
a definite goal and [...]. The form is one frequently used and hence is
familiar to the subject. Finally, it is fairly economical of space,
is easy to construct, and permits a wide sampling of subject matter in a
comparatively short time.
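
The effect of eliminating implausible alternatives can be put in figures with a small Python sketch. It assumes a pupil who guesses blindly and with equal likelihood among whatever alternatives remain; the function names are hypothetical.

```python
from fractions import Fraction


def chance_of_lucky_guess(alternatives_left):
    """Probability of hitting the right answer by a blind guess."""
    return Fraction(1, alternatives_left)


def expected_lucky_guesses(n_items, alternatives_left):
    """Number of items a blind guesser would be expected to get right."""
    return n_items * chance_of_lucky_guess(alternatives_left)
```

On a test of 40 items, a blind guesser facing five alternatives would be expected to get 8 right, but 20 right if three alternatives on every item can be struck out at sight. In simple recall no alternatives are offered, so this source of chance success is largely removed.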


Disadvantages 


[...] hence are difficult to score. The more precise the form of the question,
the less does this difficulty appear. In these days of scoring done by
automatic machines this interpretative aspect of the answer is a dis-
tinct drawback. It is not, however, a drawback to the ordinary teacher
trying honestly to evaluate the progress his students are making in the
area of instruction. Probably the greatest disadvantage of this type
arises in the difficulty of making up items which call for the higher
types of mental processes. Naming, citing, giving the author, or his
works, are closely related to rote memory. In mathematics and science,
it is easy to overcome this difficulty, a fact easily demonstrated by
noting that problems in arithmetic and algebra fall naturally into this
form.

Sentence Completion 


Most of the characteristics listed under “simple recall" also belong 
under "sentence completion." Sentences, from which certain words 
are deleted, should be definite and clear and should be composed 
anew and not simply copied from the book. Then, too, all possible 
answers for each blank should be carefully listed so that the scoring 
can be objective. Most difficult of all in constructing the sentence- 
completion test is to achieve that nicety of balance between supplying 
just enough material to make the solution possible and giving so much 
information that the answer can be guessed at. Blanks may occur at 
any place in the sentence and should possess three characteristics: 
(1) they all should be of the same length, (2) all should be numbered,
and (3) the correspondingly numbered blanks should be placed in a
vertical column to the right or left of the sentences. Blanks should
almost always call for but one word. In general the larger the number of
blanks the more difficult is the item. If the blank is placed at the
end of the sentence the completion sentence becomes a simple recall.
Here are a few examples:


Heredity is the (1) ______ relation between (2) ______ generations.

The colored part of the eye, called the iris, (3) ______ or (4) ______ as the
amount of light increases or (5) ______.

The chromosome is an (1) ______ or (2) ______ threadlike (3) ______.

The coefficient of correlation represents the (1) ______ degree of (2) ______
existing between (3) ______ traits in the same (4) ______ of individuals, each
individual being measured (5) ______.



The completion sentence probably finds its greatest usefulness in
testing the development of a rather complex idea in a whole paragraph.
Used in this manner it approaches very closely an instrument for vali-
dating the higher thought processes.

Its advantages and disadvantages are nearly the same as those of the
simple-recall type.


SHORT-ANSWER TESTS BASED ON RECOGNITION 


Four types of short-answer tests based on the capacity of the in- 
dividual to recognize the correct answer among several presented will 
be described. They are (1) multiple-choice tests, (2) true-false tests, 
(3) matching tests, and (4) tests of the higher mental processes. 


Multiple Choice 


In the type of short-answer test known as multiple choice the right
answer to a question appears among a number (usually two to four) of
wrong ones. Unless these wrong ones are as plausible as the correct one,
the purpose of the test is defeated. The answers that are not plausible
are immediately eliminated from the test, and the subject can then
make his choice between those left. Plausible wrong answers, called
distractors, are sometimes secured by giving the item as a short-answer
test in a preliminary tryout. Pupils themselves will write the wrong
answers which may then be used as the wrong alternatives. Illustrations
of this occurred in preparing tests for the selection of good navigators
in the Army Air Force. The item was first given in the incomplete form.
The errors were then compiled and used as alternates in the multiple-
choice test. Sometimes distractors are used which lead the ignorant to
the wrong answer. If these distractors have a logical or bookish
connotation they work better. Two illustrations from Inglis Tests of
English Vocabulary¹ illustrate the use of distractors:

A regular hexagon. (1) six-sided figure. (2) old witch. [...] assembly.

Answer No. 2, "old witch," is associated with the word "hex," to
bewitch. A second illustration:

It is a result of collusion. (1) bumping. (2) conflict. (3) kindness. (4) fraud. (5) lawlessness.

"Collusion" might be mistaken for "collision," and hence "bumping"
would be a distractor.

¹ Boston: Ginn & Company. By permission.



Another factor that adds to the plausibility of the alternates in the 
item is their homogeneity. The more like each other the alternates 
are, the harder they are to distinguish and the finer is the discrimination 
required. An illustration from the Columbia Research Bureau American 
History Test¹ shows exactly what is meant:


Americanization is the process of— 
(1) Keeping foreigners out of America (2) extending American trade by
means of subsidies (3) teaching American ideals to foreigners (4) becoming
naturalized (5) protecting American industries.


All answers are plausible and hard to distinguish unless one knows the 
answer. 

This illustration about "Americanization" also shows (1) that there
is no punctuation except the period at the end of the statement, (2) that 
parallel construction is maintained in all the items, and (3) that the 
multiple-choice technique is better than that of simple recall when the 
answer to a question might be long and complex, or when the answer 
might be given in one or two different ways. Moreover the three illus- 
trations use the statement form rather than the question form. Had 
this latter form been used the item would have read “What is the 
process of Americanization?" and the alternates might then have been 
introduced by “It consists of—." Some test makers prefer this direct 
question form because, they say, “it is easier to construct; it is in а 
form with which the pupil has had experience and is less likely to 
contain cues to the correct answer." These preferences seem to be 
largely matters of opinion except in the naturalness of the question 
form to children in school. 

The conditions under which the multiple-choice form is to be pre-
ferred to that of simple recall were indicated. Under certain conditions,
on the other hand, the simple-recall form is to be preferred to that of
multiple choice. When the answer is a number or symbol the simple-
recall form is best. Sometimes, too, try as one may he cannot find more
than two plausible choices for a certain item. Under these conditions,
also, the simple recall is better. Arrangement and punctuation of items
need some consideration. In arranging items of the multiple-choice
form, care must be exercised not to use a regular cycle of answers.
Each of the positions (1,2,3,4,5) should be used about the same number
of times, but there should be no logical arrangement of items. Use no
punctuation between choices but simply skip three spaces. Place the
proper punctuation at the end. Place parentheses around the numbers
which are just in front of the possible answers if the answers are numbers,
otherwise not. Here is an example:


¹ Yonkers, N.Y.: World Book Company. By permission.

One should send a check by:
1 first class mail   2 express   3 parcel post.

But note parentheses in the following example:
Banks usually pay from (1) 1% to 2%   (2) 2% to 5%.


The reader will notice that this principle has been violated in some of the 
previously quoted tests. 
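
The advice about answer positions, that each of the five positions be used about equally often and that no regular cycle be followed, can be illustrated with a brief Python sketch. The function name and the use of a random shuffle to avoid a pattern are assumptions made for the illustration, not a procedure given in the text.

```python
import random


def balanced_answer_key(n_items, n_positions=5, seed=None):
    """Assign the correct answer of each item to a position, balanced and unordered."""
    rng = random.Random(seed)  # a seed is accepted only so a key can be reproduced
    positions = list(range(1, n_positions + 1))
    full_rounds, left_over = divmod(n_items, n_positions)
    # Use every position equally often, then spread any remainder at random.
    key = positions * full_rounds + rng.sample(positions, left_over)
    # Shuffle so that the positions follow no regular cycle.
    rng.shuffle(key)
    return key
```

For a test of 20 items each position holds the right answer exactly four times; for 22 items no position is used more than one time oftener than another.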


Advantages 

The multiple-choice form is the most flexible of all the forms of short-
answer tests. Its alternates may be so near together in meaning that
it takes much keenness of discrimination to distinguish between them,
or again they may test simply information acquired by rote. They
need not be corrected for chance if there are more than two alternates.
They are to be preferred to simple recall in complicated ambiguous
problems of some length. The reliability of this form is high.


Criticisms 

Multiple-choice items are difficult to construct. It takes as much
time to construct one good multiple-choice item as to construct three
to four simple-recall or true-false items, and they occupy as much space
on the page. Plausible alternatives are hard to find. It also takes more
of the pupils' time to answer multiple-choice items than to answer
true-false items. A great impetus has been given the use of the multiple-
choice form since the advent of the IBM scoring machine. This
machine scores a whole test accurately provided the answers are placed
in certain defined positions. The multiple-choice form with its five
positions lends itself admirably to machine scoring.

True or False 


In the constant-alternatives form of the short-answer questions
the pupil is asked to give a judgment about the statement as a whole.
In the majority of cases the judgment rendered is whether the
statement is true or false, and hence this form is most commonly known
as the true-false form. It can be used in almost any field of learning and
to evaluate the materials as well as the mental processes involved. A
few samples follow:

T or F 1. The Constitution of the U.S. safeguards continuity of the Senate by
          specifying that only one-third of the Senators shall come up for elec-
          tion during one year.
T or F 2. The [...] River is usually regarded as the Great Divide.
T or F 3. Two triangles are equal if a side and an angle of one equal the cor-
          responding side and angle in another.
T or F 4. Two root words entering into the conjugating of the French
          aller are vado and eo.

¹ Rinsland, Henry D., Constructing Tests and Grading, p. 47. New York: Prentice-
Hall, Inc., 1937.


When properly constructed, this short-answer form can be used to
sample a very large number of items in a short time. It is comparatively
easy to construct and to score, although its scoring is subject to a few
more errors than is the case with other forms.


Suggestions for Improving the Construction of True or False Items 


First of all, the statements should not be lifted bodily from a text-
book with perhaps a slight change in the wording to make some of them
false. It is much better to have the idea embedded in a fresh set of
words. The language of the items should be within the comprehension
of those taking the test. Moreover, the statement should be clear
and unambiguous, not clouded in meaning by too many qualifying
clauses or double negatives. Whenever possible, quantitative statements
are better than “more” or “less” or other indicators of comparison.
For example, a statement such as


T or F  Foster children adopted into good homes increase their scores on an
        intelligence test.


may be improved by changing it to


T or F  Foster children adopted into good homes increase their I.Q. scores 6
        points on the average.


Certain determiners of a statement's truth or falsity are to be avoided.
Sentences using such determiners as “totally,” “entirely,” “com-
pletely,” “solely,” “absolutely,” “always,” “never,” “only,” “alone,”
and such other words which imply universals are usually false. On
the contrary, sentences using “should,” “may,” “most,” and
“often,” are more than apt to be true ones. By actual count of words
it has been shown that long sentences (with more than 20 words) are
likely to be true. These are the principal suggestions, although there
are some lesser ones.


The constant alternatives may appear as T or F, yes or no,
or, by a slight addition, T or F or ?. This last one, “true, false, or
question,” allows some leeway in testing those principles which pupils
sometimes answer by saying “That depends.” Still another variation
permits further shades of belief:


T, f, Ut, Uf    The mean derived from grouped data is dependent for its accuracy
                upon the distribution of the items within the interval.


To such an item an individual may respond T = true, f = false, Ut =
usually true, or Uf = usually false. In this form it approaches that of
multiple choice. Still another modification is introduced by asking
the pupil to correct with one word each false statement but to do
nothing further to the true statements.


T or (f)  Montgomery was commander-in-chief of the allied armies.   1 (Eisenhower)
(T) or f  One of the greatest landings of men in world history
          took place on the Normandy beaches.                      2 (          )


This procedure of correcting the false statements is better liked by 
pupils and students. 

Finally, one other suggestion concerning the arrangement of true 
or false sentences is in order: the sequence must not be a logical one. 
Such a sequence, for example, as T,f,f,T followed by T,f,f,T would 
soon be detected and the sentences thereafter marked correctly by 
reason of the student's ingenuity in discovering the logic of their 
arrangement and not because of his understanding or information. 
To avoid any semblance of logical arrangement one may be governed 
by the toss of a coin. For example, toss a coin and keep the record of
heads and tails. Whenever a head falls, make the statement true;
whenever a tail falls, make it false.
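
The coin-tossing rule can be imitated with a random-number generator. The following is only a minimal sketch: the function name `coin_toss_key` and the `seed` argument are illustrative assumptions, not part of the procedure described in the text.

```python
import random


def coin_toss_key(n_items, seed=None):
    """Return one 'T' or 'f' label per item, decided by simulated coin tosses."""
    rng = random.Random(seed)  # a fixed seed makes the sequence repeatable
    # heads (first half of the interval) -> write a true statement,
    # tails -> write a false statement
    return ["T" if rng.random() < 0.5 else "f" for _ in range(n_items)]


key = coin_toss_key(20, seed=1953)
print(" ".join(key))
```

Because each label is drawn independently, no regular pattern such as T,f,f,T is built into the sequence, although short runs may still occur by chance.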


Correcting True or False Statements 
It is easy to see that when there are only two alternatives a pupil
has a 50-50 chance of getting an item correct. To avoid the influence
of chance in the score, many students have recommended that the
score be obtained from either of the following formulas:
1. S = R - W. Score equals the number right minus the number
wrong. Thus, if there were 100 items, of which 50 were correct and


50 wrong, the score would be exactly zero.
2. S = T - O - 2W. Score equals the “total” (as indicated by the
number of the last item tried), minus the number omitted up to the
last item attempted, minus twice the number wrong.
Suppose there were a test in which the last item tried was the 35th
but the 25th, 28th, and 34th had been omitted. Four items were
wrong. Under these conditions, using Formula 1, S = R - W, the
score is 28 - 4, or 24. If we use Formula 2, S = T - O - 2W, we get
S = 35 - 3 - 2(4), or 24. Formula 2 is the more practical formula,
since for the most part a score can be secured by multiplying the
number of wrongs by two and subtracting this number from the total
attempted.
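
The worked example can be checked in a few lines. This is a sketch only; the function and variable names are illustrative, with T taken as the number of the last item tried, O as the number omitted before it, and W as the number wrong.

```python
def score_formula_1(right, wrong):
    """Formula 1: number right minus number wrong."""
    return right - wrong


def score_formula_2(last_item_tried, omitted, wrong):
    """Formula 2: total tried, minus omissions, minus twice the wrongs."""
    return last_item_tried - omitted - 2 * wrong


last_item_tried = 35  # T: the 35th item was the last one tried
omitted = 3           # O: the 25th, 28th, and 34th items were skipped
wrong = 4             # W: four items were marked incorrectly
right = last_item_tried - omitted - wrong  # R = 28

s1 = score_formula_1(right, wrong)
s2 = score_formula_2(last_item_tried, omitted, wrong)
print(s1, s2)  # 24 24
```

The two formulas agree because R = T - O - W, so R - W is the same quantity as T - O - 2W.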

This whole matter of correcting for chance has come under minute
scrutiny. Such formulas are based on a very large number of draws or
samples. If there were 2,000 items these formulas would be quite satis-
factory, but with 100 items chance sometimes acts very queerly. Who
has not drawn good hand after good hand in an evening of bridge while
at other times the opposite is true? Affecting directly the scoring are
the directions given. Shall we say to the subjects, “You have plenty of
time and I want you to answer all items. If you aren't sure, guess,”
or shall we say to them, “Mark only those items you are certain of.
Do not mark those of which you are not certain”? The second set of
instructions on its face seems the most sensible. But individuals differ
so greatly in their carrying out of these instructions. The quiet precise
individual may not try more than 25 out of 40, all of which will be
right. Another more venturesome lad will try 35 and make three
mistakes. The former student receives a score of 25; the second, 32.
Under these conditions the formula correcting for chance would neces-
sarily be applied although the awareness of its limitations in correct-
ing chance in a small number of items would be apparent to all. The
instructions to the subject to answer all the items seems about as good
as any other. If the subject disliked guessing very much he could dis-
regard the instructions. If students try all the items when they are
instructed to do so the correction for chance errors may be omitted
since the uncorrected score correlates perfectly with the corrected one.
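
Two points in this paragraph can be illustrated numerically. When every one of N items is attempted, W = N - R, so the corrected score R - W equals 2R - N, a straight-line function of R; hence the perfect correlation. And under pure guessing each item adds +1 or -1 to R - W with equal chance, so the chance score has a standard deviation of the square root of N. The sketch below assumes a 40-item test and five invented pupils' scores; the helper names are illustrative.

```python
import math


def pearson_r(xs, ys):
    """Plain product-moment correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def chance_sd(n_items):
    """Standard deviation of R - W when every answer is a pure guess."""
    return math.sqrt(n_items)


n_items = 40
rights = [22, 25, 31, 36, 40]                    # hypothetical pupils, all items tried
corrected = [r - (n_items - r) for r in rights]  # R - W with W = N - R

r = pearson_r(rights, corrected)
print(round(r, 6))
print(chance_sd(100), round(chance_sd(2000), 1))
```

Relative to the length of the test, the chance spread is far larger for 100 items than for 2,000, which is the sense in which chance "acts very queerly" on short tests.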

To score many columns of T's or F's is very fatiguing to the eyes.
A key made of stiff cardboard with perforated holes through which
the true scores may be seen at a glance is a very helpful device.


Matching 


In constructing a matching exercise two procedures may be followed.
In the first one, called completion matching, an essential word or phrase
is omitted within each sentence of a list of sentences. At the end of
these sentences is a list of words or phrases which contains the correct
answer for each of the omitted words. This form differs from the
sentence-completion form in that in the completion-matching form
there may be 10 or 12 answers in a column from which the correct com-
pletion to, say, 8 or 10 word omissions may be made. If the sentence-
completion form had been used there would have been four or five
possible answers for each sentence, or 40 to 50 altogether. From
Rinsland¹ appears this sample:


Part I                                                       Part II

1. ( ) The number to be multiplied is the ( ).               1. difference
2. ( ) The result of addition is called the ( ).             2. dividend
3. ( ) The number to be divided is the ( ).                  3. divisor
4. ( ) The result of subtraction is called ( ).              4. minuend
5. ( ) The result of multiplication is called ( ).           5. multiplicand
                                                             6. multiplier
                                                             7. product
                                                             8. sum


In the second type, called column matching, two columns of state-
ments are placed side by side and then the numbers of one column are
matched with the numbers or letters of the other. An example from
genetics follows:


One of the statements in Part I defines, illustrates or in some other way belongs
with the items in Part II. Place the correct number from Part II in front of the
appropriate letter in Part I.

Part I                                                       Part II

( ) a. Occurs in all individuals during the first            1. acquired character
       generation.
( ) b. Only one of two alternative characters resides in     2. congenital
       the germ cell.
( ) c. Red-green colorblindness follows the defective        3. dominant
       X-chromosome.
( ) d. Transmitted through the germ cells.                   4. inherited
( ) e. Appears in the ratio of 1 to 3 in the second          5. instinct
       generation.
( ) f. The acquisition of a disease from the mother          6. purity of gametes
       during the embryonic state.                           7. recessive
                                                             8. sex-linked


Some characteristics of matching may be observed in these
two illustrations. The more obvious ones have to do with form. There
should be more answers in Part II than are needed in Part I. This
… minimum. Only one answer must be … or logically. Great care
… in Part I or Part II … singular and plural forms of words.
Sometimes the connection is suggested through identity of singular
subject and singular verb or vice versa. Generally speaking, 7 to 10
items in Part I and 10 to 12 in Part II would be about as many as
would be practicable. It is clear also that all the items of Part I and
of Part II should be on the same page.

¹ Rinsland, Henry D., Constructing Tests and Grading, p. 104. New York:
Prentice-Hall, Inc., 1937. By permission.

Less obvious than the just-mentioned characteristics is that of
homogeneity. All the items in Part I should be like each other, i.e.,
homogeneous. The elimination of guessing may be greatly facilitated
by the homogeneity of the items. All items of Part I of the first illus-
tration could be subsumed under the four fundamental arithmetic
operations; while the items of Part I in the second illustration can be
placed under inheritance. The more homogeneous the items the more
difficult to guess the answer correctly. Hence the dictum: if you wish to
make the items more difficult make them more like each other.
There are a large variety of relations to which matching is applicable:
cause and effect, dates and events, authors and their writings, diagrams
and charts, principles and their illustrations, inventions and inventors,
angles and their names, tools and their uses, names of compounds and
their chemical formulas, and many others.


Advantages 


Many questions can be answered in a short space because the same
set of answers can be used for a large number of items. Guessing is
reduced under the usual method of construction but may be reduced
to a minimum by having several items use the same answers. Its greatest
usefulness comes in answering questions who, when, what, and where.
Whether or not it tests the more complicated mental processes depends
upon its construction. By matching principles and their illustrations
the subject is called upon to discriminate, compare, and conclude.
Such a procedure calls for the same sort of mental processes which are
demanded when an individual is asked to give an original illustration
of a principle he has learned. This type of short-answer test is capable
of making a rapid survey of a particular phase of a subject-matter
area.


Disadvantages 


Matching tests are difficult to construct. It is so easy to leave unheeded
the large variety of specifics which need to be heeded in constructing
them. Clues that one had never suspected and more than one answer
are apt to appear most unexpectedly. Furthermore, it fits so well
items such as events and their dates that more complicated processes
are apt to be neglected. Small units of subject matter seldom
possess that homogeneity demanded of a good matching test,
and hence a small unit of instruction is difficult to test adequately by
using this form.


SHORT-ANSWER TESTS: HIGHER MENTAL PROCESSES


Thus far in our discussion of the construction of short-answer tests
no special emphasis has been placed on testing the higher mental
processes. It is believed, however, that such processes may be brought
into play in answering true-false, completion, simple-recall, multiple-
choice, or matching questions. It is the purpose of this section to call
attention to the possibilities of evaluating the capacities of individuals
(1) to interpret new data which are presented, and (2) to apply prin-
ciples learned to new situations. One might even like to measure the
understanding of the nature of proof itself, but thus far such small
progress has been made in perfecting instruments for that undertaking
that this topic is omitted in the present discussion.

Ideally, it would be best to check the whole process of observing,
guessing, formulating hypotheses, gathering data, and finally making
inferences and other interpretations from the data gathered. So long
is this process and so few are they who are called upon to carry it
through that no objective criteria have been formulated which can be
applied at all stages of the total process. It, however, has been found
practicable to set up procedures by which the capacity of an individual
to interpret data already collected by others can be evaluated both
as to the type of conclusion reached and as to the manner in which the
judgment was achieved. For a discussion and illustration of an attempt
to analyze and measure clear thinking, see the discussion beginning on
page 18.


In one volume¹ attempts were also made to develop tests for prin-
ciples of logical reasoning and for the nature of proof. The reliabilities
of all these instruments were in the neighborhood of .90 as calculated
by the Kuder-Richardson formula. In general, this formula gives a
slightly lower coefficient than other procedures for computing relia-
bility. As a whole these procedures for testing specifically the higher
mental processes are still in the experimental stage. The importance of
clear thinking makes experimentation in this area extremely worth
while.

… our treatment of personality inventories. Most of them are
variants from the types introduced in this chapter.

¹ Smith, Eugene R., Ralph W. Tyler, et al., Appraising and Recording Student
Progress, pp. 111-124. New York: Harper & Brothers, 1942.



ORGANIZATION AND ARRANGEMENT OF TESTS 


Let us now assume that the objectives have been defined, and that items
for the construction of the test have been accumulated. Let us assume
further that the items have been carefully edited and cast into the
desirable test form. There still remains the organization of the items
and their arrangement.

Arrange the items under test forms. Suppose, for example, the
course had been a survey of American history covering the period
from 1865 to 1900 and that some of the items were true-false, some of
them simple-recall, and some others matching. In the arrangement of
three or four forms on one topic there would necessarily be a small
number of true-false items, a small number of matching items, and a
small number of simple-recall items. By assembling all true-false items,
simple-recall items, etc., into one division of the test the same mental
set prevails for a much longer time and the confusion of shifting mental
sets is avoided. For this reason, it is better to place all the true-false
items in one section, all the matchings in another, and all the simple-recall
items in still another.

Arrange the items from easy to hard. In general it is better to arrange
items roughly from the easy to the more difficult. An exact grading of
items according to difficulty is manifestly impossible until they have
been tried out with a number of subjects and the percentage passing
each item calculated. The teacher from his acquaintance with the
class and the difficulty of the items can arrange them into four or five
groups of increasing difficulty. If the difficult items are placed first
the subjects may spend so much time on the first items that there is no
time left for the easy ones which come later or else may get so dis-
couraged because they seemingly cannot answer enough items in
the test that they give up completely. The easy-to-hard arrangement
gives the subject confidence, and if he takes too much time on the more
difficult items he has at least finished the major part of the test. The
items should range in difficulty from those almost all the class get right
to items difficult enough so that very few get them right. It is best if
the average of the class lies somewhat near half the number of items.
This idea should be kept in mind by the test constructor, but a mean
lying between 35 and 65 in 100 items would not greatly disturb the
efficiency of the test.
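
Once a tryout has supplied the proportion of pupils passing each item, the easy-to-hard ordering and the expected class average follow directly. The sketch below uses five invented items and invented proportions; only the idea of ordering by percentage passing comes from the text.

```python
# Hypothetical tryout results: proportion of the class passing each item.
proportion_passing = {
    "item A": 0.35,
    "item B": 0.92,
    "item C": 0.10,
    "item D": 0.60,
    "item E": 0.78,
}

# Easy to hard: the item most pupils pass comes first.
ordered = sorted(proportion_passing, key=proportion_passing.get, reverse=True)

# The class mean on the whole test is the sum of the item proportions.
expected_mean = sum(proportion_passing.values())

print(ordered)
print(round(expected_mean, 2), "out of", len(proportion_passing))
```

Here the expected mean is a little over half the number of items, which is near the level the text recommends.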


Arrange items so that their answers cannot be guessed or worked out
logically. Suggestions have already been made how this is done in true-
false, matching, and multiple-choice tests. Some sort of chance arrange-
ment is best. In the multiple-choice form see that each of the positions
(1,2,3,4,5) has the correct answer about the same number of times.

Arrange numbered dotted lines in vertical columns on either the right or
left so that the responses can be easily scored. These empty dotted lines
must be long enough in completion and simple recall to write the answers.
Little children especially have a tendency to write larger than do adults.
Most authors recommend that these columns be placed to the left of
the numbered item in true-false and matching tests and to the right in
multiple-choice, completion, and short-answer tests.

Score tests by using a prepared key. If the instructions concerning the
correct placement of the answers are carried out, one may then make a
good scoring sheet by writing in the correct answers with a red pencil.
Place these filled-in sheets right by the answers of the subjects and
checking may go on at a rapid pace. If there is a large class it may pay
the grader to paste the answers on a cardboard strip which holds its
position without bending too much. True-false corrections are apt to
cause some trouble. If “T or f” is placed just to the left of each item,
then the correct items may be punched out of a cardboard with a
circular punch. If this is then placed over the score column the correct
items may be seen through the holes.

Give detailed instructions. Generally speaking, blanks will
be left on the outside of the paper for the date, the name, and the grade
and for both part scores and the total score. Explain to the subjects
exactly what is to be done in each case. Explicit instructions must also
be given to the children concerning the following of directions, whether
… and whether they may use any leftover …

The tester must be provided with enough extra pencils so that there will
be no delay in the progress of the examination. He must read over the
instructions aloud with the children, answer their legitimate questions,
and make sure that they understand exactly what they are to do before
they begin. Children should be practiced in a preliminary way before
they take these short-answer types of test. The tester must see that the
children are not disturbed and that the stop signals are properly given.

Explain the effect of guessing … then they should be told
… the children be asked to go … and mark those they know,
then go back … and guess at the rest of the
items. In this case the items need …

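
The advice given earlier in this section, that each of the five multiple-choice positions should hold the correct answer about the same number of times and in a chance order, can be sketched as follows. The function name and the seed are illustrative assumptions.

```python
import random
from collections import Counter


def balanced_positions(n_items, n_positions=5, seed=None):
    """Assign the correct answer of each item to a position 1..n_positions.

    Positions are dealt out evenly first, so their counts differ by at most
    one, and then shuffled so that no logical sequence can be detected.
    """
    rng = random.Random(seed)
    positions = [(i % n_positions) + 1 for i in range(n_items)]
    rng.shuffle(positions)
    return positions


positions = balanced_positions(20, seed=7)
counts = Counter(positions)
print(positions)
print(sorted(counts.items()))
```

With 20 items and five positions each position carries the correct answer four times; only the order is left to chance.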
IMPROVING THE ESSAY TYPE OF EXAMINATION 

In the essay type of examination the first consideration is its
validity: whether it reflects the objectives decided upon at the
beginning of the course. Unless the objectives aimed at in the course are
tested by the type of examination used, the examination is all but
useless. To be more specific, essay questions frequently ask the student
to “compare,” “contrast,” or “discuss.” The adequate answering of
such questions depends on the manner in which the course has been
taught. All along throughout the course the student must have practice
in comparing, contrasting, and discussing. He must know beyond a
doubt that “compare” means to set down facts or lines of evidence side
by side and from their contemplation come to a reasoned conclusion.
And thus it is with “contrast” and “discuss.” It is impossible for
students to make such contrasts, comparisons, etc., unless they have
been trained in the forms and procedures used to arrive at reasoned
conclusions. When such conditions have been met the essay type of
examination gains in altitude because it requires the students to per-
form complex mental processes involving comparison and inference.

It is very probable that the higher mental processes involved in
comparing, contrasting, and discussing can be appraised by the essay
question just as accurately and with greater economy than by the
objective types of testing described on page 20. Whereas two or three
pages with a variety of possible inferences are needed to construct
an objective test only a short sentence asking for the precise com-
parisons and inferences may be all that is needed for the essay type of
examination.

For these reasons it is of the first importance for the teacher to know
how (1) to construct effective essay-type questions, and (2) to score
them more precisely so that the reliability of the examination based on
them will be adequate.


VALUE OF THE ESSAY-TYPE QUESTION

The limitations and undoubted weaknesses of the ordinary essay-
type questions and of the examinations composed of them have already
been described on pages 41 to 43. In spite of these criticisms there
were those who felt sure that this type of question was valuable because
it assayed many of the higher mental processes involved in the organiza-
tion and evaluation of experience. It was and has always been the
medium used in writing compositions and preparing articles in jour-
nalism courses. It has, on the other hand, been woefully misused when
it inquired for details of information which could have been tested
much more effectively by the short-answer tests such as the true-false,
short-answer, multiple-choice, etc.

A historian, A. C. Krey, who is greatly interested in the teaching as
well as the testing of the outcomes of social science, writes as follows:¹

¹ Kelley, Truman L., and A. C. Krey, Tests and Measurements in the Social
Sciences, p. 480. New York: Charles Scribner's Sons, 1934. By permission.



Furthermore, such minute sampling of social science knowledge 
[by means of short-answer tests] clearly did not constitute a test 
of the student's comprehensive knowledge, or of his ability to 
develop sustained exposition of large ideas and to include the con- 
ditional elements which qualify any but the most simple of social 
situations. In other words, the extremely short answer form of the 
test seemed an artificial limitation which must confine such tests 
to the measurement of only the fragmentary beginnings of social
science knowledge.


It is possible through essay questions “to develop sustained exposition 
of large ideas and to include the conditional elements which qualify 
any but the most simple of social situations." When items selected 
from a large number are to be brought to bear in a central topic, when 
they are to be compared and evaluated, and from this procedure an 
inference is to be drawn the essay question is more effective than the 
short-answer type. For these reasons, the essay question has weathered
the storm of criticism.

It is the purpose of this discussion to show how (1) the questions
can be so improved as to register more precisely the desired processes,
and (2) the accuracy of scoring can be greatly increased by deciding
on the items to be counted before the tests are scored and by instructing
the students in the essentials of good answers before they begin the
test.


IMPROVING QUESTIONS OF THE ESSAY TYPE

Substantial progress in describing and illustrating a rich variety of
types of essay questions, 20 in all, was achieved by the publication of
Monroe and Carter¹ in 1923. Ten years later, Weidemann's² 11 different
types of usable questions were made available. From these two lists
the author has selected and illustrated 10 different types of questions.
It is important to understand that these are simply illustrations, which
need to be adapted to the framework of the course which is being
conducted.

¹ Monroe, Walter S., and Ralph E. Carter, The Use of Different Types of Thought
Questions in Secondary Schools and Their Relative Difficulty for Students, Bureau of
Educational Research Bulletin No. 14, University of Illinois, 1923.
² Weidemann, C. C., “Written Examination Procedures,” Phi Delta Kappan
(1933) 16:78-83.

1. Interpretation
   a. Interpret the following lines of evidence bearing on
      the problem of maturation in young children: (1) neurological,
      (2) co-twin control, (3) parallel groups.


   b. How do you interpret such evidence as “Not a cough in a
      carload,” “Doctors say there is no throat irritation from
      smoking brand x,” etc., when used in radio advertising?
2. Criticism and evaluation
   a. Criticize and evaluate the effect of the Yalta Agreement.
   b. Criticize the notion of “independent unit” in heredity.
3. Statement of purpose
   a. What was Shakespeare's purpose in introducing the witches
      into Macbeth?
   b. What is the purpose of local government?
4. “How” questions
   a. How would you set up an experiment to demonstrate the
      influence of air pressure on the lifting power of a pump?
   b. How is it possible for an airplane to rise and remain in the
      air for certain periods of time?
5. Cause and effect
   a. What was the effect of the removal of price controls on the
      cost of ordinary commodities?
   b. What is the effect of high mountains near the coast and
      prevailing winds on the amount of rainfall in the interior?
6. Statement of relationship
   a. In what ways is the reliability of a test related to its validity?
   b. What is the relation between rainfall and crop yield?
7. Comparison and contrast
   a. Compare the actions of Lady Macbeth with those of Macbeth
      when they were contemplating the death of Duncan.
   b. Point out the leading differences between a confederation
      and a republic.
8. Illustrations and examples
   a. Give two illustrations of the influence of the Federal acts
      … on Southern political life between 1867 and …
   b. Name three examples of the action of oxygen.
9. Application of rules or principles
   a. Would a piece of iron 6 feet long be longer or shorter when
      heated? Why?
   b. Would an ordinary pump lift water higher or lower on a
      mountain than on a plain? Why?
10. Discussion
   a. Discuss the influence of weather on rocks and soils.
   b. Discuss the meaning of a climax in a play as to (1) its general
      nature, (2) its position in the usual play.



IMPROVING THE SCORING OF ESSAY-TYPE QUESTIONS 


After the teacher has assured himself that (1) the questions reflect 
the presence of the complex mental processes which are the objectives 
of his course and (2) the questions are carefully and accurately made, 
his next problem is to improve the reliability of scoring such ques- 
tions. Two procedures will be described, both of which require that the 
acceptable answers be set down and considered before the scoring
begins.

The Sorting or Rating Method


In Sims's preliminary investigations, results obtained from scoring
separate questions, one at a time, and adding up the scores from the
separate tests were compared with results obtained from rating exami-
nations for general merit. Sims¹ concluded that, of the two, rating for
general merit was more economical of time and more reliable. His
procedure is roughly described by the following imperatives:

1. After a quick reading, sort the papers into five groups: (a) very
superior, (b) superior, (c) average, (d) inferior, and (e) very inferior.
The number in each pile is somewhat controlled by the percentages
allocated to each pile. The highest and lowest piles are to receive 10
per cent each; the next piles, as we move toward the center, 20 per cent
each; and the middle pile 40 per cent. Thus we have about 10, 20, 40,
20, and 10 per cent in the five piles. There was no inclination to use
exact percentages.
2. Do not give separate grades to individual questions, but place
each paper in its appropriate pile according to its general total merit.
3. Reread the papers, then shift a paper to another pile when such a
procedure seems warranted.
4. Give all the papers in the highest pile, A; the second highest, B;
and so on until all the papers on the lowest pile receive E.

A similar procedure has been recommended by Rinsland² but the
percentages approached more nearly those of the normal curve: 6, 22,
44, 22, and 6 per cent approximately. He advised the raters to “think
only of quality in terms of subject matter.” Better results are achieved
if the names are written on papers where the rater cannot see them.
The reliabilities achieved by correlating two teachers' ratings of the
same papers ranged from .67 to .79, with an average of .72.

¹ Sims, Verner M., “The Objectivity, Reliability, and Validity of an Essay Examina-
tion Graded by Rating,” Journal of Educational Research (1931) 24:216-223.
² Rinsland, op. cit., p. 253.
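
To see roughly how many papers each pile should hold under the two sets of percentages, the sizes can be worked out for a class of any number. The sketch below is an illustration only: the function name and the largest-remainder rounding rule are assumptions, and the text itself stresses that the percentages are approximate.

```python
def pile_sizes(n_papers, percents=(10, 20, 40, 20, 10)):
    """Split n_papers into piles whose sizes follow the given percentages.

    Whole papers are allotted first; any papers left over go to the piles
    with the largest fractional remainders, so the sizes always add up
    to n_papers.
    """
    exact = [n_papers * p / 100 for p in percents]
    sizes = [int(x) for x in exact]
    leftover = n_papers - sum(sizes)
    by_remainder = sorted(range(len(percents)),
                          key=lambda i: exact[i] - sizes[i],
                          reverse=True)
    for i in by_remainder[:leftover]:
        sizes[i] += 1
    return sizes


sims_30 = pile_sizes(30)                                  # 10, 20, 40, 20, 10 per cent
rinsland_50 = pile_sizes(50, percents=(6, 22, 44, 22, 6))  # nearer the normal curve
print(sims_30)
print(rinsland_50)
```

These figures serve only as a guide for the first sorting; papers may still be shifted between piles on rereading.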




The Point-score Method of Scoring Essay-type Questions

In the point-score method an analysis is made of the acceptable
answers to the questions. It is decided what each part shall receive.
In the procedure used in the College Entrance Examination Board,
the readers have gotten together and, after consultation with one
another, have decided upon the number of points to be attributed to
each acceptable item.¹ The result is much more exact because the
questions have been carefully constructed and even tried out in a
… remarkable … Thus the reported reliabilities … College Entrance
Examination Board, obtained … sample of the papers are:² …
reliability can be greatly increased. These more exact procedures in
scoring have aided us to preserve the important characteristics of the
essay-type question.

¹ Noyes, E. S., “Recent Trends of the Comprehensive Examination in English,”
Educational Record, Supplement No. 13 (1940) 21:107-119.
² Stalnaker, John M., “Essay Examinations Reliably Read,” School and Society
(1937) 46:671-672.

In conclusion, it is abundantly clear that both short-answer and 
essay-type questions are necessary to measure adequately the objectives 
of the ordinary course. Very clearly have the advantages of the short- 
&nswer been stated by A. C. Krey, a student who is not himself an 
expert in test construction! 


It [the new type of test] is the most efficient device for detecting 
the student's possession of those separate material elements which, 
though not the end of instruction, are an essential preliminary to 
those ends comparable to the shoring which the engineer employs 
in shaping buildings made of concrete. No other testing device 
covers so great a range of information in so short a time, or can
be graded so quickly and accurately. It may also be used to discover
the student's knowledge of the simpler and limited relationship
of this material. It may be used to some extent, also, in testing
students' ability to apply ideas to new materials, and his possession
of the skills involved in the subject. The more advanced and
complex stages of these values, however, must as yet be discovered
by other forms of test.


While many testers would deny the strictures placed upon the short-
answer test for measuring the more complex stages of understanding,
they would all agree that the short-answer test covers the largest area
in the shortest time.

When comparisons are to be made, contrasts to be indicated, assump-
tions stated, materials to be summarized or outlined, and deductions
made or inferences drawn from a large amount of material, the essay
type of test is more efficient and should be used.

¹ Kelley, T. L., and A. C. Krey, Tests and Measurements in the Social
Sciences, p. 482. New York: Charles Scribner's Sons, 1934. By permission.

SUMMARY

Short-answer tests are an attempt to evaluate more precisely and
more ... the results of instruction. They depend for their usefulness
upon an exact definition of the objectives of instruction. Their main
strength lies in two major areas. In the first place, they ... can
sample in the time available a much larger ... than can the essay type
of examination. ... When carefully constructed, they can measure more
... reliably. They are weakest in the evaluation of the ... such as
judgment and reasoning. These short-answer instruments may be divided
into two classes: those based on recall and those based on recognition.

Two types of short-answer tests based on recall are the simple-recall
type and the completion type. In simple recall the unit of instruction
is broken down into smaller ones and definite questions are asked that
can be answered in a word or in a phrase. Some care must be taken to
phrase the question in such a restrictive way that only one answer
will be possible. Acceptable answers are listed for each item before
the scoring begins. In the completion type those key words are omitted
which presumably can be supplied only by those who are steeped in the
material being tested. Considerable ingenuity is needed on the part of
the test constructor to provide just enough of the sentence to make
the thought intelligible and not enough to give away the answer. Each
blank should be of the same length and have in it a number in
parentheses. These numbered omissions allow, for easy scoring, a
vertical column of blanks whose numbers correspond to the blanks in
the body of the test.

The second type of short-answer questions is based on the principle
of recognition. Several answers are supplied, and the subject to get
the answer correct must check the right answer. Four forms of this
type of question were set forth: multiple-choice, true-false, matching,
and tests of higher mental processes. Of these, matching and multiple
choice are most alike. They depend for their efficiency upon the
plausibility of all answers and the homogeneity of the answers
themselves. The multiple choice has the greatest all-round usefulness.
In practice about four or five plausible choices are used for each
question, from among which the subject tries to choose the correct
answer. The matching technique is more compact, since only one list of
answers is necessary for a number of questions. The number of answers
in this list may not be more than two more than the number of items to
be matched. The sets of answers or matches should be homogeneous among
themselves. The true-false form is easy to construct and to score. It
is handicapped as a form because there are only two choices ...
getting an item correct. ... Tests aimed at testing the higher mental
processes ... perceiving relationships in data ... have been most
successful. Such tests are based on the principles of reaching the
correct conclusion from data and of checking the right principle on
which correct interpretation depended.


The essay type of question may stimulate the student to use
his higher mental processes, to state conditions on which an assumption
rests, and to develop a sustained exposition of large numbers of ideas.
Furthermore, it permits the outlining and summarizing of great areas
of information. Because of these strong points the essay type of question
needs to be improved. Such improvement may come in the question
and in its scoring. The questions may be improved by incorporating
in the question the lines of reasoning which are to be developed.
Improvement in scoring accrues from deciding upon the answers to
questions before the scoring begins, followed by either (1) a rating
of the examinations as a whole and dividing them into appropriate
piles, or (2) defining rigidly the points to be scored and then summating
the points.


QUESTIONS AND EXERCISES

1. Make a point-by-point comparison of the recall type of short-answer
tests with the recognition type. Which type seems to you to furnish
the better evaluation?

2. Select an area of information with which you are very familiar.
Construct 20 true-false items, 20 multiple-choice items, 10 completion
items, and 20 short-answer items. Follow closely the principles laid
down for the construction of each type. Which type measures better the
higher mental processes involved?

3. How should the wrong items be treated in scoring the true-false
type? The multiple-choice type? Can you write a general formula for
correcting for chance which will apply to all cases involving guessing?
Explain the principle involved.

4. Describe the procedures used in evaluating the use of the higher
mental processes. Do you think this process of reasoning should be a
defined outcome of our education? Why?

5. What advantages might accrue from the emphasis upon evaluation in
the learning process?

6. What are the leading difficulties in measurement of outcomes of
education arising out of the use of the essay type of test? What
essential outcomes are tested by the essay type of examination which
are very difficult to test by the short-answer type?

7. Describe the procedures used for improving the construction and
scoring of essay-type tests.

8. Distinguish sharply between the situations suitable for (a) the
short-answer test, and (b) the essay-type test.


BIBLIOGRAPHY

CRONBACH, L. J.: "An Experimental Comparison of the Multiple
True-False and Multiple Multiple-Choice Tests," Journal of
Educational Psychology (1941) 32:533-

HAWKES, HERBERT E., E. F. LINDQUIST, and C. R. MANN: The Construction
and Use of Achievement Examinations, Part II, pp. 163-442. Boston:
Houghton Mifflin Company, 1936.

KELLEY, T. L., and A. C. KREY: Tests and Measurements in the Social
Sciences. New York: Charles Scribner's Sons, 1934.

MICHEELS, W. J., and M. RAY KARNES: Measuring Educational
Achievement. New York: McGraw-Hill Book Company, Inc., 1950.

MONROE, WALTER S., and RALPH E. CARTER: The Use of Different Types of
Thought Questions in Secondary Schools and Their Relative Difficulty
for Students, Bureau of Educational Research Bulletin No. 14,
University of Illinois, 1932.

NOYES, E. S.: "Recent Trends of the Comprehensive Examination in
English," Educational Record Supplement No. 13 (1940) 21:107-119.

ORLEANS, JACOB S., and GLENN A. SEALY: Objective Tests, Chap. XIII,
pp. 218-242. Yonkers, N.Y.: World Book Company, 1928.

REMMERS, H. H., and N. L. GAGE: Educational Measurement and
Evaluation, pp. 146-193. New York: Harper & Brothers, 1943.

RINSLAND, H. D.: Constructing Tests and Grading. New York:
Prentice-Hall, Inc., 1937.

ROSS, C. C.: Measurement in Today's Schools, 2d ed., pp. 103-171.
New York: Prentice-Hall, Inc., 1947.

RUCH, G. M.: The Objective or New-type Examination, Part II,
Chaps. VII-X, pp. 149-280. Chicago: Scott, Foresman & Company, 1929.

SIMS, VERNER: "The Objectivity, Reliability, and Validity of an
Essay-examination Graded by Rating," Journal of Educational Research
(1931) 24:216-223.

SMITH, E. R., R. W. TYLER, et al.: Appraising and Recording Student
Progress, Part I. New York: Harper & Brothers, 1942.

STALNAKER, JOHN M.: "Essay Examinations Reliably Read," School and
Society (1937) 46:671-672.

TRAVERS, ROBERT M. W.: How to Make Achievement Tests. New York: The
Odyssey Press, Inc., 1950.

TYLER, R. W.: Constructing Achievement Tests. Columbus, Ohio: The
Ohio State University Press, 1934.

WEIDEMANN, C. C.: "Written Examination Procedures," Phi Delta Kappan
(1933) 16:78-83.

WEIDEMANN, C. C.: "Review of Essay Test Studies," Journal of Higher
Education (1941) 12:41-44.


CHAPTER 4 


The Testing Program—Achievement-test Batteries 


Let us assume that the objectives of instruction of an elementary 
school have been decided upon and that teacher-made tests have been 
administered and the results studied. But the outcome is not satisfying, 
something seems to be lacking. There is no way of deciding for certain 
whether the pupils are really doing as well as schools in other com- 
munities. Other questions as to whether the pupils are progressing at 
the usual rate also arise. Such a condition furnishes a fruitful oppor- 
tunity for developing a program of testing with standardized tests. 


PLANNING FOR THE TESTING PROGRAM 
For a program to be most successful, it must have the cooperation
of the entire staff. Even a few malcontents can throw a monkey wrench
into the machinery. To ensure this desirable cooperation the whole
faculty must be involved. The principal, therefore, must call them
together and the whole problem of testing must be introduced. It is
well in this initial meeting to have someone well versed in testing to
present the matter. Suppose now that the faculty votes in favor of
such a program. If so, committees are formed to study the areas where
testing can be done with the greatest promise of success. After a short
while the committees make their reports, thresh out their differences,
and define and agree upon their major needs of testing.


DEFINING THE PURPOSE 
From the democratic procedures described in the preceding para-
graph, suppose that the following purposes emerged:

1. To ... pupils in reading for understanding.
2. To ... level of success of each pupil in each of the subjects of
the curriculum according to his age, his grade, and his ability.
3. To ... of each pupil in each subject.
4. To ... school subject each pupil is strongest and in which, weakest.




TABLE 2. PLANNING THE TESTING PROGRAM*

TESTING PROGRAM ORGANIZATION CHART
Community: Anytown, U.S.A.

PURPOSES OF THE PROGRAM: 1. To aid teacher in a better understanding
of the ability and achievement level of each pupil. 2. To point out
subject strengths and weaknesses in each school and in the community.

GRADES TO BE TESTED (Circle):
  Intelligence: 1,2,3,4,5,6,7,8,9    Achievement: 1,2,3,4,5,6,7,8,9
  Other (Give type of test and grades): Metropolitan Readiness, Grade I

MONTH OF TESTING:
  Intelligence: September    Achievement: October
  Other: (Readiness) October

TESTS TO BE USED (Indicate name of test, battery, and form):
  INTELLIGENCE
    Pintner-Cunningham       Grade(s) 1
    Pintner-Durost           Grade(s) 3
    Pintner Intermediate     Grade(s) 6 + 8
  ACHIEVEMENT
    Metropolitan, Prim. I    Grade(s) 2
    Metropolitan, Prim. II   Grade(s) 3
    Metropolitan, Elem.      Grade(s) 4 + 5
    Metropolitan, Inter.     Grade(s) 6 + 7
    Metropolitan, Adv.       Grade(s) 8
  OTHER TESTS
    Metropolitan Readiness   Grade(s) I

DIRECTOR OF THE PROGRAM: Miss Mary Drake (Elem. Supervisor)

EXAMINER(S): Psychologist; Principals; Teachers X; Others
SCORER(S): Teachers (Individual); Teachers (Group) X; Clerks;
  Machines; Others
CHECK SCORER(S): Teachers (Individual); Teachers (Group) X; Clerks;
  Others

METHOD OF TEST DISTRIBUTION: Tests will be packaged in the principal's
office and distributed at the teachers' meeting.

REPORTS TO BE MADE:
  By the teacher: Profile Chart X; Class Record X; Class Analysis
    Chart; ... Record X; Other summaries
  By the principal: School summary X; Other summaries
  By the director: Administrative Summary X; Other summaries

TEST RESULTS TO BE RECORDED IN TERMS OF:
  Intelligence: Ratio IQ; Deviation IQ X; Mental Age
  Achievement: Standard score; Grade Equiv. (Trad.); Age Equiv.;
    Percentiles (Trad.)

SCHEDULE OF TEACHER CONFERENCES
  Before testing (Date) September 20
  Before scoring (Date) October 25
  For i...

* From Planning the Testing Program, by permission of World Book
Company, Yonkers, N.Y.




TABLE 2. TESTING PROGRAM ORGANIZATION CHART (Continued)
TESTING SCHEDULE

Day (date)  Hour   Test                               Grade   Adm. time,
                                                              minutes
Monday      9 AM   Pintner-Cunningham                 1       25
                   Pintner-Durost                     3       45
                   Pintner Intermediate               6 + 8   45
Monday      PM
Tuesday     9 AM   MAT Prim. I, Tests 1, 2, 3         2       30 (app.)
                   MAT Prim. II, Tests 1 + 2          3       40 (app.)
                   MAT Elem., Tests 1 + 2             4 + 5   35
                   MAT Inter., Tests 1 + 2            6 + 7   35
                   MAT Adv., Tests 1 + 2              8       35
Tuesday     2 PM   MAT Prim. I, Test 4                2       15 (app.)
                   MAT Prim. II, Tests 3 + 4          3       30 (app.)
                   MAT Elem., Tests 3 + 4             4 + 5   65
                   MAT Inter., Tests 3 + 4            6 + 7   80
                   MAT Adv., Tests 3 + 4              8       80
Wednesday   9 AM   MAT Prim. II, Test 5               3       15 (app.)
                   MAT Elem., Tests 5 + 6             4 + 5   35 (app.)
                   MAT Inter., Tests 5 + 6            6 + 7   40
                   MAT Adv., Tests 5 + 6              8       50
Wednesday   2 PM   MAT Inter., Tests 7, 8, 9, + 10    6 + 7   60
                   MAT Adv., Tests 7, 8, 9, + 10      8       60


... the year, teachers are more likely to teach the particular items
present in the test, which spoils the test results, since the items are
representative samples of a much larger number. The test results may
also be used for grouping pupils in class the next fall, though in this
respect they are not of the greatest use because of differential
forgetting during the summer vacation. The author leans toward autumn
because he is most interested in the use of tests for instructional
purposes.

Let us suppose now that the season of testing has been decided upon.
There still remains the class scheduling which must be arranged in ...
teacher will know exactly when the tests are to ... The Testing
Program Organization Chart, Table 2, contains complete details. It
lists the grades to be tested, time, director of the program, ... and
the ... testing schedule. The time for administering each test is an
essential and important detail for complete ... which gives the day,
the time of day, the names of the tests, the grades, and ...
administering each test is of especial interest in the present
connection. Some such detailed schedule should be formulated before
the testing is begun.


ADMINISTERING THE TESTS 


The teacher is the one who must administer the tests. In some pro-
grams planned for special purposes a member of a trained staff of
testers may administer the tests. For the ordinary testing program
designed to understand more accurately the achievement and intel-
lectual abilities of pupils so that improvement in instruction may be
facilitated, the teacher gives the tests.

To do this job well the teacher must divorce himself from his role
as teacher and assume a new one, that of tester. To do this well he
must become thoroughly acquainted with the tests to be used. One
of the best ways to do this is to go through the entire procedure:
(1) read the instructions, (2) take the test, (3) score the test, and
(4) interpret it. In the first place know the instructions so well that
most of the testing time may be used in watching the subjects. Wise
testers go through the manual and mark in red (1) what has to be
read aloud, (2) where the instructions begin, and (3) above all, where
the timing is located. If samples of the test are given to be worked
out, he must see that they are done properly. The teacher's job as
tester is to see that the pupils (1) understand the directions,
(2) do not ..., (3) work continually and faithfully, and (4) have a
quiet place to work without interruptions. To avoid interruptions place
a placard on the door of the classroom: "Testing Going On - Do Not
Disturb."

Timing is most important. Secure a stop watch, if possible; otherwise
use a watch with a second hand. Write down the time the test begins and
the time it ends. While the tester in no way hints or suggests what the
answer to an item is, neither is a good tester a "deadpan." The good
tester encourages children to work by asking them to try ... and
encourages them to do their ...

... meeting of the faculty and go over the test's details point by
point as has been suggested in the preceding paragraph. He must
emphasize, as must we all, that instructions and directions must be
followed exactly. Unless the instructions are ..., comparisons with
norms and with records of ... grades cannot be usefully made.


SCORING THE TESTS 
For best results the teachers are once again called together by the
leader and the details of scoring carefully reviewed. All standard
tests give detailed instructions for scoring. In some large municipalities
the pupils write their answers on a separate sheet and the scoring is
done by an International Business Machine, commonly called IBM.
But in our plan the teacher scores the papers. It is always an onerous
task and takes several hours of work. Many devices have been developed
for shortening the scoring time, such as window stencils, stiff cardboard
with lists of answers, and squares in which the right answer enters a
cross.


The experience of the author recommends the following procedure
for its speed and enjoyment. If there are eight subtests, obtain the
services of nine teachers who sit around a large table. Each teacher
becomes responsible for scoring a single test. Teacher No. 1 scores the
first test and folds the paper back to Test 2, which the second teacher
scores; he then turns to Test 3, which teacher No. 3 scores, etc. The
ninth teacher brings the scores forward to the front of the test, enters
them in the proper place and adds them up. Once this procedure has
been started, the scored tests roll off the line in a continuous stream.
After a little practice each teacher practically memorizes the answers
for his test and the work moves rapidly.


For accurate scoring it is necessary for the scoring to be checked by a
person not concerned in the first scoring. Samplings of about one test in
five for checking are adequate. Errors most likely to occur are con-
cerned with correct adding, computing averages or medians, and
scoring those items which are scored by the right-minus-wrong (R - W)
technique.
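The right-minus-wrong technique named above is one case of the usual correction for chance. The following is a minimal sketch, assuming the standard formula S = R - W/(n - 1) for items with n choices; the function name and the sample counts are illustrative and are not taken from any test manual.

```python
def corrected_score(right: int, wrong: int, n_choices: int) -> float:
    """Score corrected for chance: R - W / (n - 1).

    Omitted items are counted neither right nor wrong. With
    n_choices = 2 (true-false) the formula reduces to R - W.
    """
    if n_choices < 2:
        raise ValueError("an item needs at least two choices")
    return right - wrong / (n_choices - 1)


# True-false items: 40 right, 10 wrong gives 40 - 10 = 30.0
print(corrected_score(40, 10, 2))
# Five-choice items: 40 right, 10 wrong gives 40 - 10/4 = 37.5
print(corrected_score(40, 10, 5))
```

A check scorer could recompute a sample of papers this way and compare the result with the hand-entered score, since subtraction slips are among the errors most likely to occur.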

INTERPRETING AND UTILIZING THE RESULTS

Intelligent interpretation and utilization of results are likely to be
the weakest links in the chain of testing. But without them the whole
testing program is without value for improving the processes and
materials of education. The problem of interpretation hinges on the
arrangement of scores in such a manner that their meaning is immedi-
ately apparent. Generally speaking, some sort of derived scores are
more meaningful than the raw scores secured from the tests. Among
derived scores are age and grade scores, I.Q.s, and percentiles.

The first illustration of interpreting scores will be the record of a
single child ... with slight variations in any completed title page
(Fig. 2). This pupil is in grade 7.2, or two-tenths of the distance
through the seventh grade. By looking at Average Achievement at the
bottom of the table, you note that his score is ... His achievement is
somewhat above his grade standing. To interpret this score more
accurately we would need his I.Q. also. If his I.Q. were 100 this
would be good; if it were 125, he would not have fully achieved up


FIG. 2. Completed title page for the advanced battery. (Metropolitan
Achievement Tests, Advanced Battery, Complete: Form S. World Book
Company, Yonkers, N.Y., 1947.) The title page lists, for each test, the
pupil's score and grade equivalent: 1. Reading; 2. Vocabulary; Average
Reading; 3. Arithmetic Fundamentals; 4. Arithmetic Problems; Average
Arithmetic; 5. English (I. Language Usage; II. Punct. and Cap.;
III. Grammar); 6. Literature; 7. Social Studies: History; 8. Social
Studies: Geography; Average Social Studies; 9. Science; 10. Spelling;
Average Achievement.


TABLE 3. CLASS RECORD SHEET* (Metropolitan ... Battery, Grade 6)

Columns: Name; C.A.; M.A.; I.Q.; 1. Read.; 2. Vocab.; Ave. Read.;
3. Arith. Fund.; 4. Arith. Prob.
Rows: 25 pupils listed alphabetically (1. Abrams, John; 2. Boyd, Sue;
3. Cady, Arthur; ... 25. Waters, Roy), followed by the median of 25.



to the level of his ability. By again examining the column called Grade
Equivalent we find a variation from 5.4 in arithmetic fundamentals to
10.1 in social studies. Here is a variation of over 4.5 grades. Craig
undoubtedly needs special work in arithmetic. The first thing to do
would be to go over his arithmetic tests with him, to discover in what
processes he had made his errors and to enlist his cooperation in planning
for his improvement.

A second way to study the results of testing is by means of the usual
class record sheet. In one which is before the author, there are 25 names
arranged alphabetically like a class roll. For each child there is arranged
along the top a record of C.A., M.A., and I.Q. and grade equivalents
in 10 different subjects taught in the elementary school: reading,
arithmetic, English, etc. This furnishes three items additional to those
of Craig Smith in Fig. 2. These are C.A., M.A., and I.Q. (Table 3).
One can now study the pupils' grade equivalents in the light of
chronological age and ability as measured by an intelligence test.
In this table, one child, 11 years and 6 months of age, has a grade
equivalent of 10.1 in reading; while another child of ... years and ...
months scores 5.9 in the same subject. In the first instance we note
that his M.A. is 12-8 and his I.Q. is 110. In the case of the second
child, his M.A. is only 10-8. It is thus seen that his ... is more
closely related to the M.A. than the C.A. ... also the grade
equivalent of each member of the entire class in reading or in
any other subject may be inspected. If desired, the median ... in
reading may be computed. In this sixth-grade class, the grade
equivalents in reading vary from 10.6, the highest, to 3.8, the lowest,
with a median grade equivalent of 6.1. You can readily see that the
individual with the grade equivalent of 3.8 has a hard time trying to
read sixth-grade materials.
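The inspection of a class record sheet described above amounts to finding the range and median of one column and picking out the pupils who fall below the class's grade placement. The following is a minimal sketch under stated assumptions: the pupil labels and most of the values are hypothetical, chosen only so that the highest, median, and lowest agree with the 10.6, 6.1, and 3.8 quoted in the text.

```python
from statistics import median

# Hypothetical reading grade equivalents for five pupils in a sixth grade.
reading_ge = {
    "Pupil A": 10.6,
    "Pupil B": 7.2,
    "Pupil C": 6.1,
    "Pupil D": 5.9,
    "Pupil E": 3.8,
}


def summarize(grade_equivalents):
    """Return the lowest, median, and highest grade equivalent."""
    values = sorted(grade_equivalents.values())
    return {"lowest": values[0], "median": median(values), "highest": values[-1]}


def below_grade(grade_equivalents, grade_placement):
    """Names of pupils whose grade equivalent is under the grade placement."""
    return sorted(
        name for name, ge in grade_equivalents.items() if ge < grade_placement
    )


print(summarize(reading_ge))
print(below_grade(reading_ge, 6.0))
```

The same two functions could be applied to any other column of the sheet, such as arithmetic or spelling.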


TABLE 3. CLASS RECORD SHEET* (Continued)

Columns: Ave. Arith.; 5. Eng.; 6. Lit.; 7. Hist.; 8. Geog.; Ave. Soc.
Stds.; 9. Science; 10. Spell.; Ave. Ach't.

* By permission of World Book Company, Yonkers, N.Y.




A third method of understanding quickly what ... is a graphical
representation by means of which a child's record can be compared also
with the average of his class. In Fig. 3 appears such a ... In the
first place you will note that the pupil's I.Q. is 87 while that of
the class is 96.5. This low I.Q. explains somewhat his ... on
arithmetic problems and reading for understanding. The greatest
retardation of this child in comparison with the class is in geography
and history followed closely by vocabulary and ... The class profile
shows a class low in arithmetic, good in English, and poor in geography.

The fourth and final illustration of the interpretation of ... in
Fig. 4, a normal progress chart. In this cumulative attainment record
for grades 4 ...


ting record at the botto 
the weakness of usin: 


At the very beginning, in the Spelling co 
grown worse instead of 

in the record at the bottom we find he had i 
He had certainly dropp 


had improved absolut 


He 
ators use such records аѕ e 
For this reason the reco a 
n the pupil’s folder and entered on his cuni 
owth of the pupi 


. А n£ 
Problem Кӣ 
en 
ted 
уё 
I А d 
" 
x m hich the pupil’s stro 
be easily seen, 
ta, Programs 


vement 
national n 


й attained as compare 
orms more easily understood. 


FIG. 3. Completed profile chart (intermediate battery), comparing the
performance of an individual with that of the class in terms of
traditional grade equivalents. Metropolitan Achievement Tests:
Intermediate Battery, Complete. Class: I.Q. 96.5, chronological age
11-4. Pupil: I.Q. 87, chronological age 11-6. Scales shown: Grade
Equivalent Scale; Age Equivalent Scale. (By permission of World Book
Company.)


FIG. 4. Normal progress chart: a cumulative attainment record for
grades 4-8 (Coordinated Scales of Attainment).


Programs of achievement testing usually begin with test batteries
which survey the various areas taught in the elementary school. If
the records from such tests show that the average grade-equivalent
scores in some area, say arithmetic, are much lower than desirable
then an achievement-test battery consisting only of tests of arithmetic
is used. Survey tests, then, furnish the general level of achievement
together with analysis of the total into levels of various subjects. The
separate achievement test of a single school subject covers all areas
of this subject in greater detail and provides, for this reason, greater
opportunities for analysis and diagnosis. The diagnostic test con-
structed after careful investigations of errors and misunderstandings
offers still greater opportunities for analysis and diagnoses of those
errors and misunderstandings which retard so greatly the learning
process.

Generally speaking, then, testing proceeds as follows: (1) achieve-
ment-test batteries, (2) subject-test battery, and (3) diagnostic tests.
For this reason, it seems logical to present discussions in this order.
Our first treatment of standardized tests will be, therefore, of achieve-
ment-test batteries.

DEVELOPMENT OF ACHIEVEMENT-TEST BATTERIES 

In the early days of testing achievement, there was an attempt to
narrow the function measured, so that its measurement could be more
exact. Thus among the scientific tests first developed were Stone's
Reasoning Test in Arithmetic, Thorndike's Handwriting Scale, Courtis's
Arithmetic Tests, and Buckingham's Spelling Scale. It is true that
there were some scales involving more complex processes such as the
Thorndike-McCall Reading Scale, the Hillegas Scale for Measuring
Composition, and the Hotz Algebra Scales. But the tendency was
toward simplifying or abstracting the function so that it could be
measured accurately. In all cases, there was no attempt to measure
several areas in ... test.

Certain practical difficulties developed as a result of this procedure.
In the first place, it was expensive and time-consuming to administer
five or six tests at different times. Furthermore, even if the tests were
administered and properly scored there was no way of comparing the
results, say, of two tests. ... some test such as vocabulary ... have
four or five equivalent ... be used to study children ... a period of
several years.

... also characterized these earlier ... In the first, the items were
of equal difficulty. The ... number of items finished in a defined
time. Not how hard but how ... was the question to be answered.
Examples of tests constructed on ... are the Courtis Tests of
Arithmetic and the Ayres ... Reading. The quality of the work was
controlled by ... the problems that were correctly done. In the second
method the items of the test increased in difficulty from the first to
the last. Time was controlled by allowing for the test ample time for
... finish. The attempt was made to have some problems easy enough for
all to make some score and difficult enough so that none would finish.
If there were several subjects who (1) made no score, or (2) finished
all the items in the test, their scores were said to be undistributed.
In some of the tests, such as the Woody Arithmetic Test, the items were
carefully scaled in difficulty by making use of the number of correct
answers which an item received. An item ...; while an item with only
5 per cent of the scores correct was a difficult one. Today the great
majority of test items are constructed according to method No. 2, i.e.,
they increase in difficulty within the test.
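The idea of judging an item's difficulty from the number of correct answers it receives can be sketched as follows. This is only an illustration under stated assumptions: the proportion-correct index, the function names, and the small response matrix are hypothetical, and the scaling actually used for published tests may have been more elaborate.

```python
def item_difficulty(responses):
    """Proportion of pupils answering each item correctly.

    `responses` is a list with one entry per pupil; each entry is a list
    of 0/1 marks (1 = correct), one mark per item. A low proportion
    indicates a difficult item.
    """
    n_pupils = len(responses)
    n_items = len(responses[0])
    return [sum(pupil[i] for pupil in responses) / n_pupils for i in range(n_items)]


def order_easy_to_hard(responses):
    """Item indices arranged from easiest to hardest (method No. 2)."""
    p = item_difficulty(responses)
    return sorted(range(len(p)), key=lambda i: -p[i])


# Four hypothetical pupils, three items.
marks = [
    [1, 0, 1],
    [1, 0, 0],
    [1, 1, 1],
    [1, 0, 1],
]
print(item_difficulty(marks))     # [1.0, 0.25, 0.75]
print(order_easy_to_hard(marks))  # [0, 2, 1]
```

Arranging the items in the returned order gives a test that increases in difficulty from the first item to the last.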


It was fortunate for the testing ... competent ... common
characteristics. In all of them several subject areas are used and all
tests are standardized on the same population ... to make direct ...
progress of pupils in one subject and ... several others. One could
thus say with some degree of assurance that John's reading ability was
definitely above his ability in the fundamentals of arithmetic. As
these instruments developed two problems were continually being met.
The ... of the factor of age. One fourth grade ... another, but the
children of one might ... the other. It became apparent that ...




grade could automatically be brought to the desired level in attainment. 
Thus the problem of age needed to be definitely taken into account. 
The second problem of prime importance came in attempting to make 
comparisons between children on tests in which the items composing 
each test were significantly different in number. A reading test might 
have 50 items while an arithmetic problems test might have 10 
items. A score of 10 on these two tests would have a quite different 
meaning. Two procedures are most commonly used to meet this prob- 
lem. One of them is the use of the T-score, or standard score; the 
second is the grade position. Many test batteries use both these
techniques, 

The T-score really g 
Erew with scores from large popul 
that many of them grouped themse 


and fewer scores occurred as one went 
Mean in either direction. In short, the normal curve fitted closely 


Enough the scores thus arrived at. It was also recognized that the stand- 
ard deviation was (see page 507 for the statistics involved) the best 
indicator of dispersion or deviation from the mean. By putting these 
Wo ideas together there was developed the T-score with a mean of 50 
And a standard deviation of 10. In the original computation as developed 
Y McCall, 5 standard deviation units were used in either direction 
Tom the mean. This would mean 2 continuum beginning at 0 and 
Boing to 100. Progress of pupils could thus be measured in terms of 
Standard units which are as nearly equal to each other as any unit oi 
Measurement thus far discovered in education. It became apparent 
5 time went on that it was not necessary to have 50 as a mean anda 
Standard deviation of 10. One might use a mean of 100 or 150 and a 
Standard deviation of 20 with equal accuracy among the units. Semi- 
"erquartile or Q units have also proved useful. At any rate, raw scores 
transmuted into these T-scores, direct comparisons are made 
Еее the several tests, and profiles are drawn from them to aid the 
E * in comprehending immediately the total pattern of the de 
вау ортеп. Note especially that this condition holds only if all the 
are sta: "di. ате ро ulation. | | 
le hin ek eR a ete scores from ү үзү 
Merica] length is the grade equivalent. The grade equivalent ^ s 
М Vantage of being easily understood. À score of 10 on the arithmetic 
Toblem solvin t t d ht be accomplished by the average child in 
Brade 4 while v Miis of 10 on a reading test might be attained by the 


таве of grade 3. 


rew out of the standard score. As experience 
ations of children it was apparent 
lves near the mean but that fewer 
further and further out from the 


be transmuted into grade equivalents 


82 PROBLEMS OF MEASUREMENT 


                          Score    Grade Equivalent
Arithmetic reasoning       10           4.0
Reading                    10           3.0


If, therefore, a child had a score of 10 in each of these two subjects,
we can then say that there is a difference of one whole grade between
his ability in one subject and his ability in the other. The grade unit,
while very practical, is not as accurate as the T-score; i.e., the growth
of a grade at one level of advancement does not equal the growth of a
grade at another level. While grade equivalents must be used with
caution they are high in practicality.
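
The conversion of raw scores to T-scores can be illustrated with a short
sketch. It assumes the simple linear form of the standard score (raw score
minus mean, divided by standard deviation, then rescaled); McCall's original
computation is not spelled out in this passage and may have differed. The
function name, the two small norm groups, and their scores are hypothetical
and chosen only for illustration.

```python
from statistics import mean, pstdev


def to_t_scores(raw_scores, target_mean=50.0, target_sd=10.0):
    """Linearly rescale raw scores to a chosen mean and standard deviation."""
    m = mean(raw_scores)
    sd = pstdev(raw_scores)  # standard deviation of the whole norm group
    if sd == 0:
        raise ValueError("all raw scores are identical; nothing to rescale")
    return [target_mean + target_sd * (x - m) / sd for x in raw_scores]


# Hypothetical norm groups: a 50-item reading test and a 10-item arithmetic test.
reading_raw = [10, 20, 25, 30, 40]
arithmetic_raw = [2, 4, 6, 8, 10]

reading_t = to_t_scores(reading_raw)
arithmetic_t = to_t_scores(arithmetic_raw)

# A raw score of 10 is the lowest reading score but the highest arithmetic score.
print(round(reading_t[0], 1), round(arithmetic_t[-1], 1))  # prints: 35.0 64.1
```

The same raw score of 10 falls well below the mean on one test and well above
it on the other, which is the point of the passage. Passing target_mean=100
and target_sd=20 gives the alternative scale mentioned in the text without
changing the relative standing of any score.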


COMPLETE BATTERIES 


For purposes of study, test batteries may be divided into two groups.
The first of these attempts to sample nearly all the outcomes of the
elementary school. [...] The batteries of the Metropolitan and of the
Stanford Achievement are as follows:

Metropolitan                            Stanford Achievement

Primary I: grade 1
Primary II: grade 2                     Primary: [...] grade 2 and grade 3
Elementary: grades 3 and 4              Intermediate: grades 4-6
Intermediate: grades 5 and 6            Advanced: grades 7-9
Advanced: grades 7 and 8 and
  first half of grade 9


Two differences appear from the outline. The Metropolitan includes
tests for grades 1 and 2 separately, and it has more batteries. The
advantage in having a test cover fewer grades lies in the fact that many
problems have been taught in the upper grades which the children in the
lower grades have not learned. These unknown problems tend to
discourage some students and make them feel that the test is unfair.


CONTENTS OF THE TWO TESTS

Metropolitan                        Stanford Achievement
(Advanced Battery)                  (Advanced Battery)

 1. Reading                          1. Paragraph meaning
 2. Vocabulary                       2. Word meaning
 3. Arithmetic fundamentals          3. Language usage
 4. Arithmetic problems              4. Arithmetic reasoning
 5. English                          5. Arithmetic computation
 6. Literature                       6. Literature
 7. Social studies: history          7. Social studies I (history)
 8. Social studies: geography        8. Social studies II (geography)
 9. Science                          9. Elementary science
10. Spelling                        10. Spelling


In general, the content of these [...] Although the content of the
tests differs widely, [...] weak points. Neither has made specific
provision for diagnosing causes of low standing in any area and both of
them lean heavily on factual information. In some cases small facts are
lifted bodily from their associations and made into a test. The
names of books and their authors, what the Vikings called their stories,
where the Po Valley is: these are samples taken at random from the
Stanford Achievement Test. From the Metropolitan, samples are who
[...]eus was, what Arachne was skilled in, and who the first settlers
at Saint Augustine were. Further general discussion about the strengths
and weaknesses of these tests will appear at the end of this section.


TEST BATTERIES OF FUNDAMENTALS


The other type of test concentrates on what might be called the
fundamentals of education, which must be learned whether one is a
conservative or a progressive in his educational philosophy. The
constructors of these tests are skeptical about objective tests in
literature, science, and social science. Many of them fear that the
factual content which lends itself so easily to test construction does
not represent the best outcomes of instruction in these fields. They
argue that those areas where hierarchies of habits prevail, as in
language or arithmetic, can be most [...] tested. In the second place,
these constructors might say that because of the great length of such
batteries as the Metropolitan it is not possible to arrange techniques
for satisfactory analysis or diagnosis of the scores. They, therefore,
would limit their testing to reading, language usage, arithmetic,
spelling, study techniques and, in one case, handwriting. In this type
of test special arrangements are made for diagnosis and analysis of each
area tested. Illustrations of this type are (1) the Iowa Every-pupil
Tests of Basic Skills, and (2) the California Achievement Tests.


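IOWA EVERY-PUPIL TESTS OF BASIC SKILLS

                                                     Time, minutes
Test                                              Elementary  Advanced

A. Silent reading comprehension ..................    46         68
     I. Reading comprehension ....................    36         58
    II. Vocabulary ...............................    10         10
B. Work-study skills .............................    47         77
     I. Map reading ..............................    11         28
    II. Use of basic references ..................     8          5
   III. Use of index .............................     8         10
    IV. Use of dictionary ........................    12         17
     V. Alphabetization (elementary); reading
        graphs, charts, and tables (advanced) ....     8         17
C. Basic language skills .........................    46         55
     I. Punctuation ..............................    11         14
    II. [...] ....................................     8         11
   III. [...] ....................................    13         18
    IV. [...] ....................................     8         12
     V. Sentence sense (elementary only) .........     6
D. Basic arithmetic skills .......................    57         63
     I. Vocabulary and fundamental knowledge .....    12         15
    II. Fundamental operations, whole numbers,
        fractions, [...] .........................    20         30
   III. [...] ....................................    25         18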


material in the same areas, with three exceptions: (1) instead of
alphabetization in Test B it substitutes reading graphs, charts, and
tables; (2) in Test C it omits sentence sense; and (3) in Test D it adds
percentage to Part II.

From the contemplation of this table and from the study of the test
itself it is clear that by sacrificing breadth this test has achieved
depth. Its test of work-study skills is very complete and may aid
greatly in finding strong and weak points. A real aid in locating
difficulties occurs in the test of the vocabulary and in the test of the
fundamentals of arithmetic.

Let us consider the reading test. In the advanced battery there are
only four sections to be read, but each section is made up of four or
five paragraphs and covers a large page. There is room here for a unit
of thought to be developed and an opportunity to ask questions
involving both the content of the paragraph and the interrelations
of the paragraphs. Further study of this test appears in connection
with our treatment of social studies.

The California Achievement Tests limit themselves pretty largely
to the same area of testing as the Iowa basic-skills test but organize
their material more nearly like that of the more complete batteries.
This test provides also a handwriting scale by means of which the
handwriting of the words spelled may be rated. On the back of
each pupil's test paper there is a device to record the percentage
of errors in the various sections of the test. The page numbers
on which the opportunities for these errors occurred are written in.
The device shown on pages 88 and 89 is a sample.

Some of the procedures used for testing are different from those of
other tests. For example, in the reading test of the elementary battery
the first part of the test has to do with word forms. Are the two words
"same" or "different"? Not only do the words increase as to length
and complexity but the printing varies from ordinary type to the
use of capitals and italics, one word in capitals and the other in
italics. Vocabulary is presented with the words' opposites as well as
with their similars. The test on following directions resembles
an intelligence test. There are from 18 to 21 different parts. It is on
these short parts that the diagnosis of errors is based. Perhaps these
short parts on which inferences are based constitute the weakest
characteristics of the tests. For example, the table of contents test
has only six items and the test for using the index only six items.
Punctuation is tested with only a few sentences which are to be
properly punctuated. On the other hand, the tests for the middle grades
are overloaded with arithmetic fundamentals, of which there are four
large pages and 80 examples. Worst of all, perhaps, is an English-usage
test based on


[Table: California Achievement Tests. Contents and time limits of the
Primary Battery (grades 1-3), the Elementary Battery (grades 4-6), the
Intermediate Battery (grades 7-9), and the Advanced Battery (high school
and college).]


only 10 items which check the difference in usage between such words
as “did” and “done,” “those” and “them,” “seen” and “saw,” and
“throwed” and “threw.”

The evidence points to competent diagnosis in the areas of arithmetic
and reading but not in language. Even the inferences concerning
weaknesses in reading would be based on rather slim evidence when
individual sections are used. The results of analysis would only be
tentative and suggestive, with nothing of the finality secured from the
test as a whole.

There are other test batteries which follow more or less closely the
Stanford and Metropolitan batteries. The Unit Scales of Attainment
(recently changed to Coordinated Scales of Attainment) furnish good
tests at every level of the elementary school, as do the Gray-Votaw
General Achievement Tests. There are also the Modern School
Achievement Tests.

Detailed accounts of the tests of school subjects appear under their
various headings in this text.


EVALUATION OF TEST BATTERIES


These test batteries furnish very important facts which are of aid
in guidance of pupils toward defined objectives and in the appraisal
of the results achieved. The results of these tests when carefully given
and scored are more accurate and dependable than facts gathered from
any other source. Nor are they lacking in comprehensiveness. Indeed,
the more elaborate ones sample most of the more formal defined
outcomes of the elementary school. Their norms, established so carefully
from such large populations, furnish bases of reference not only for the
test as a whole but for each of the areas measured. Thus guidance is
suggested not only from the results of the test as a whole but also from
the results of the single division. And when diagnosis is added to
analysis, guidance is greatly facilitated.

These composite [...] help in guiding the transfer student,
especially if local norms are available for comparison. As a whole,
test batteries (all possessing high reliabilities) are indispensable for
measuring achievement and for furnishing primary and supplementary
facts for guidance purposes.

The major weaknesses of test batteries lie in the nature of objective
examinations. No objective test is able to test the capacity
of an individual to gather facts and to marshal them around a problem.
[...] individual to write a theme or [...] of literature, social
science, and elementary science there is a strong [...] of information
rather than [...]

TABLE 4. CALIFORNIA ACHIEVEMENT TESTS, ELEMENTARY BATTERY, FORM A

DIAGNOSTIC ANALYSIS OF LEARNING DIFFICULTIES

[Chart: for each test of the battery (reading vocabulary, reading
comprehension, arithmetic reasoning, arithmetic fundamentals, and
language) the chart lists the sections of the test and the item numbers
belonging to each kind of difficulty, with blanks in which the teacher
checks the items each pupil has missed.]

involve some of the higher thought processes. The application of the
ability to use the scientific method or the attitudes of pupils are
important questions which remain generally untouched.

This simply means that the test batteries in spite of their importance
do not tell the whole story. Supplementary facts must be gathered if
the whole progress of the child is to be evaluated for purposes of
instruction and guidance.


USES OF TEST BATTERIES

The results of testing a school population with achievement batteries
may be used in a variety of ways.

The Administrator

The administrator, who must always exercise a broad overview of
the total educational situation, finds [...] duties from the results
[...] First and foremost [...] scores derived from these [...]

In the second place, comparisons may be made between the [...]
comparable grades in his system. For example, grade 5 in this
manufacturing area may be compared with grade 5 in that better
residential area.

In the third place, grade-equivalent scores derived from average
achievements in such [...]

[...] scores on the various parts. His interest is
more specific and more immediate.

In the first place, scores in the different areas of the test [...]


the teacher with the standing of the class as a whole on each subject.
This [...] profile variations of his class as a whole [...] strength
and weakness. Especially important is the fact that the general trend
is highly dependable [...] cannot be attributed to chance errors [...]
If his class is definitely backward in language, this fact can be
trusted.

In the second place, profiles of each pupil should be made and
studied. Such a graph emphasizes strong and weak points and aids
the acquisition of sound information about the pupil as an individual.
Here is, for example, a pupil whose arithmetic scores are good but who
is especially low on language and literature. The area needing
attention is thus made apparent.

In the third place, teachers can [...] be made of the errors which
[...] fundamentals by means of the following device.


METROPOLITAN ACHIEVEMENT TEST, INTERMEDIATE BATTERY, FORM T
(Analysis of errors)

  I. Items in addition
     A. Whole numbers: 1, 3, 4, 5
     B. Decimals: 40
     C. Fractions: 23, 24, 25, 26
     D. Zero combinations: 2
     E. Mixed units: 52

 II. Items in multiplication
     A. Whole numbers: 11, 12, 13, 15
     B. Decimals: 42, 43
     C. Fractions: 32, 33, 34, 35
     D. Zero combinations: 14
     E. Percentage: 54, 55, 56

III. Items in subtraction
     A. Whole numbers: 6, 7, 8, 9, 10
     B. Decimals: 41
     C. Fractions: 21, 28, 29, 30, 31

 IV. Items in division
     A. Whole numbers
        1. Short division: 16, 17, 18, 19
        2. Long division: 20
     B. Decimals: 46, 47, 48
     C. Fractions: 22, 36, 37, 38, 39

  V. Graphic presentation: 44, 45, 53

 VI. Changing units of measure: 49, 50, 51, 52
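
An error analysis of this kind amounts to a tally of missed items by
category. The sketch below shows one way it might be done for the
multiplication section only, using the item numbers from the outline above.
The pupil's list of missed items is invented for illustration, and the
function name is hypothetical; nothing here comes from the test manual.

```python
from collections import Counter

# Item numbers for the multiplication section, copied from the outline above.
MULTIPLICATION_ITEMS = {
    "Whole numbers": [11, 12, 13, 15],
    "Decimals": [42, 43],
    "Fractions": [32, 33, 34, 35],
    "Zero combinations": [14],
    "Percentage": [54, 55, 56],
}


def tally_errors(missed_items, item_map):
    """Return {category: (items missed, items in category)}."""
    category_of = {item: cat for cat, items in item_map.items() for item in items}
    # Items outside this section (e.g. a subtraction item) are simply ignored.
    counts = Counter(category_of[i] for i in missed_items if i in category_of)
    return {cat: (counts[cat], len(items)) for cat, items in item_map.items()}


# Hypothetical pupil: these missed item numbers are invented for illustration.
missed = [14, 32, 33, 35, 43, 7]

for category, (wrong, total) in tally_errors(missed, MULTIPLICATION_ITEMS).items():
    print(f"{category}: {wrong} of {total} missed")
```

For this invented pupil the tally points to fractions (3 of 4 missed) as the
area needing attention. A category with only one or two items, such as zero
combinations, gives too little evidence for more than a tentative judgment.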


With such details of weaknesses available, substantial changes in
materials of instruction and procedures of teaching were made.

The Pupil

In the first place, the objectively [...] attitude toward his work.
The pupil [...] education. It [...] that they only gather dust.

After the [...] the purposes defined, the selection of the
best tests to meet those needs [...] tests are selected, their details
of [...] be reviewed with the teachers before these activities are
undertaken. Most difficult of all for teachers to learn is the process
of [...] for purposes of interpretation. Following this quantitative
[...] graphical arrangement of records comes the planning of
[...] and methods for improving conditions found. This is the
[...] of the testing program.

The general sequence of achievement tests in a comprehensive testing
program is usually (1) the achievement-test battery, (2) the individual
subject test, and (3) the diagnostic test. In this text, therefore,
achievement-test batteries introduce our discussion of standardized
tests.

Achievement-test batteries at the elementary level sample rather well
the [...] outcomes of the more formal aspects of education. Since they
are standardized on the same population, comparisons may be made
between standings in the several subjects of instruction. This makes
possible the study of levels of achievement of pupils, classes, schools,
and school systems. The achievement levels of pupils may be used to
group within a class and may be highly suggestive of the types of
[...] suitable for each child's educational progress. For these
reasons achievement-test batteries have become customary in American
schools.

QUESTIONS AND EXERCISES

1. [...] in considerable detail a testing program for your school.
Parallel the [...] in the text. What aspects of [...] program do not
seem to be included in the text?

2. Discuss the importance of derived [...] purposes of interpretation.
Illustrate.

3. Describe in detail the important procedures necessary for
administering [...] What does the author mean by a "dead [...]"?

4. Explain what is meant by (a) standard score, (b) grade equivalent,
(c) percentile score.

5. Describe three graphical procedures usable for interpreting scores.

6. Compare the Stanford Achievement Battery with the Metropolitan
Battery in respect to (a) area covered, (b) establishment of norms,
(c) profiles of students, and (d) reliability. Secure samples and
manuals and examine them point by point.

7. What are the advantages for education of such tests as the Iowa
Every-pupil Tests of Basic Skills and the California Achievement Tests?
The disadvantages? To what use can such tests be put in addition to
grade placement and subject achievement?

8. To what uses can the administrator put the results of testing?
Illustrate.

9. How can the teacher use the results of tests? The pupils?

10. How have the uses of test records for purposes of educational
guidance been illustrated in this chapter?



CHAPTER 5


Measurement of Reading, Spelling, and Handwriting


There is some logic in grouping reading, spelling, and handwriting
together in one chapter. On many occasions in the elementary and high
school they appear in close interrelation, as when a child summarizes
in writing what he has read. These three, together with language,
constitute the essential tools for further language instruction and for
communication. The tests of language are treated in Chap. 6.

READING

In this chapter the section on reading includes a treatment of both
elementary and high school tests. The spelling tests described, however,
are only those suitable for the elementary school. Spelling tests for the
high school are discussed in Chap. 6 under the caption Language and
Literature.

Some authors have considered reading as one of the receptive language
[...] But reading is certainly more than merely becoming aware of
what is on the printed page. Good reading always involves a variety of
responses which are related to meaning. Almost all reading-test makers
recognize this by affording opportunities to respond correctly to what
has been read.

IMPORTANCE OF READING

Learning to read constitutes the major activity of the elementary
school. Failure to acquire adequate facility in this process is
accompanied with the direst consequences in the upper grades, in high
school, and in life. Reading progress needs to be checked at every level
of achievement to make certain that satisfactory results have been
attained. One of the more difficult problems is to decide upon the
optimum time to begin instruction in reading.

OBJECTIVES IN TEACHING READING

As early as 1927 Gist and King stated clearly in a brief statement
the major objectives of reading.1 These frequently quoted aims are:

1 Gist, Arthur S., and W. A. King, The Teaching and Supervision of
Reading, p. 11. New York: Charles Scribner's Sons, 1927. By permission.


(1) Rich and varied experience through reading.
(2) Strong motives for, and permanent interest in, reading.
(3) Desirable attitudes and economical and effective habits and
    skills.
    (a) Development of well-established fundamental reading
        habits.
    (b) Effective habits of intelligent interpretation.
    (c) Ability to use books, libraries, and other sources of
        information economically and effectively.

[...] 3 into two parts, (a) attitudes and (b) [...] tested. These
objectives [...] the difficulty of [...] goals which are not clearly
defined or else defined vaguely.

From the standpoint of defined objectives, the list of reading abilities
described by Horn and McBroom [...] successful measurement. They list
the abilities (1) to recognize words, (2) to locate material quickly,
(3) to comprehend quickly what is read, (4) to select and evaluate
material needed, (5) to organize what is read, and (6) to remember what
is read. [...] with definite, [...]


TESTS OF READING IN ACHIEVEMENT-TEST BATTERIES

Reading tests constitute an integral part of all achievement-test
batteries. Generally speaking, there [...] of such tests are (1) the
California Achievement Tests and (2) the Iowa Every-pupil Tests of
Basic Skills.

The California Achievement Tests [...] manual for reading, which [...]
from the battery as a whole [...] vocabulary, information about a book,
the use of an index, and tests of understanding paragraphs. The 90-word
vocabulary test, answered [...] opposites to words, is divided into four
parts equally distributed among words needed in (1) mathematics,
(2) science, (3) social science, and (4) general reading. Their tests on
reading comprehension deal with the ability to follow directions and, as
the manual puts it, "The test situations measure the students' ability
to (1) read and comprehend directly stated facts, (2) select the best
topics or central ideas, (3) make inferences and deductions from written
material, and (4) read and comprehend the author's ideas as expressed in
paragraphs" (page 3). Reliabilities of the two parts and of the reading
test as a whole are about .90. It also furnishes details for
constructing a diagnostic profile.

The reading test of the Iowa Every-pupil Tests of Basic Skills is
called Test A: Silent Reading Comprehension. The advanced battery
for grades 5, 6, 7, and 8 is divided into Part I, Reading Comprehension,
and Part II, Vocabulary. [...] these 50-word vocabulary tests includes
many words suitable for these grades. Such words as "desirable,"
"indefinite," "civil," and "[...]tial" are set in multiple-choice items.
The tests of comprehension, [...] are of the work-study type, are
excellent examples of test construction. In the first place the
selections are much longer than usual, each one filling a large page and
consisting of three to five paragraphs. In addition, their subject
matter is concerned with material little known to the [...] Such
selections as "The Boomerang," "The [...]," "Billy Sunday," and "The
Northwest Passage" constitute new material for most readers in the upper
grades. [...] of the questions are informational, with the answers
contained in the paragraphs, but many of them call for understanding
and [...] One example must suffice: in one paragraph there is a [...]
description of the shape of a boomerang but the question asks the
reader to select from four visual shapes the one "most nearly the shape
of a boomerang." The author, in his search for tests of reading
comprehension, has been unable to find another test as good as this one.
It tests (1) the meaning of words, (2) the meaning of sentences, (3) the
meaning of paragraphs, and (4) the [...] paragraphs.

[...] (1) tests of reading readiness, (2) tests [...] reading [...],
and (3) tests of reading diagnoses.
" 
Tests of Reading Readiness A 

value for guidance in beginning formal 
ber work, they do not predict final 


Qtr - Intelligence tests are of 
of reading readiness. There was 


с Н 
Achie, ‘on in reading and num 
ment marks as well as tests 


98 PROBLEMS OF MEASUREMENT 


needed an instrument which would test specifically those Sons d 
which instruction in reading depends. It is these instruments v 
to be described. | 
Reading-readiness tests were constructed to measure precisely those traits which are required to learn to read. Careful analyses were made of those traits which reflected clearly the maturing process. Among these traits were the following:
1. Language growth
2. Correctness of language usage
3. Interest in learning to read
4. Visual and auditory discrimination and reasoning ability
5. Knowledge of facts and events in common experience
6. Number information
7. Motor control

1. Metropolitan Readiness Tests
2. Gates Reading Readiness Tests
3. Lee-Clark Reading Readiness Test
4. Reading Aptitude Tests, Marion Monroe
5. Stevens Reading Readiness Test

One of these is described at some length as a sample, and then will follow a discussion of the value and use of reading-readiness tests in teaching and guidance.


varies from two boats (like), 


e 
place, two-place, and th: 


P i n 
to an ellipse and a circle, to pairs of 07, 


о“ 


TEST 1. SIMILARITIES [sample items not reproduced]




recognized from drawings, each word from four drawings. Words such as “key,” “desk,” “bridge,” “jewel,” “blossom,” “bonfire,” “insect,” and “poultry” are used. Words like “moccasin,” “chariot,” “insect,” and “poultry” are at the more difficult end of the scale. Fig. 6 shows sample pictures. In No. 8 the word is “lantern”; in No. 12, the word is “melon”; and in No. 17, the word is “insect.”


© pumpkin in the window ” or « ee veadin& 
uo dod or “Тһе man is rea 
а book, or The man at the drugstore le thin d. He sell? 
medicine and things for Sick people.” gs we need. 
Test 5 is a very complete sam 


t board"), th uu c 
and triangle from their names, to write [Nr жы. aga бй ; 
‚ 7) 


to understand how to count seven and thirteen, to know something 
? 


TABLE 5. TESTS OF READING READINESS

Gates Reading Readiness (group), Teachers College, Columbia University
    Reliability: .97 (N, 174)
    Achievement correlations: Gates Primary Achievement test = .706
    Intelligence-test correlations: Pintner-Cunningham (Gates Reading Readiness + Pintner-Cunningham) = .76
    Contents: Test 1. Picture directions. Test 2. Word matching. Test 3. Word-card matching. Test 4. Rhyming. Test 5. Reading letters and numbers.

Metropolitan Readiness Tests (group), World Book Company
    Reliability: .83 to .89
    Intelligence-test correlations: *Pintner-Cunningham = .53 (N, 94); Detroit First Grade Intelligence Test = .70 (N, 34); *combination of 3 intelligence tests, .79
    Contents: 1. Recognition of likeness and difference between forms and letters. 2. Copying figures. 3-4. Comprehension of words and phrases. 5. Number knowledge. 6. Common knowledge.

Stevens Reading Readiness Test (group), World Book Company
    Reliability: .96
    Achievement correlations: [...] achievement = .80 (N, 460) after 10 [...] reading instruction
    Contents: 1. Recognition of objects and [...] different [...]. 2. Recognition of words and phrases from among others.

Lee-Clark Reading Readiness Test, California Test Bureau
    Reliability: .92 (N, 170)
    Achievement correlations: Lee-Clark Reading [...] = .61
    Intelligence-test correlations: California Test of Mental Maturity = .65
    Contents: Test 1. Matching symbols. Test 2. Crossing out letters different from others. Test 3. Vocabulary and following instructions. Test 4. Identification of letters and words.


TABLE 5. TESTS OF READING READINESS (Continued)

Monroe Reading Aptitude Tests (partly group, partly individual), Houghton Mifflin Company
    Reliability: .87
    Achievement correlations: Gray's Oral Reading Test and Iota Word Test = .75 (N, 85)
    Contents: Group tests. 1. Visual: a. Identifying forms and their positions; b. Tracing a maze; c. Drawing a picture. 2. Motor: a. Dots in circle; b. Keeping on line with pencil. 3. Auditory: a. Recognizing correct pronunciation; b. Recognizing a word sounded out phonetically. 4. Vocabulary.

Betts Ready to Read Battery of Tests (individual), Psychological Corporation
    Contents: Preschool through college. Physiological and psychological tests.

Van Wagenen Reading Readiness Tests (individual), Educational Test Bureau
    Reliability: .94
    Achievement correlations: Reading tests, end of grade 1 = .73
    Contents: 1. Range of information. 2. Perception of relations. 3. Vocabulary. 4. Word discrimination. 5. Memory span for ideas. 6. Word learning.


simple ordinal numbers, t, 


traction and addition. 


Test 6, a test of information, consists of questions which involve the recognition of common objects in pictures. The child is asked to mark “the thing to carry when it rains,” “what helps people to see better,” and “the thing in which [...] the ocean.”

or i 
€cognize 1$ 


. 0р“ 
» and to do the simplest $ 




p E ks problem of drawing a man and of writing one's name. 
From an analysis of the content of these tests it is evident that they contain material more precisely like reading than does the general intelligence test. They have the great advantage of being analytical and of furnishing evidence of weaknesses in certain areas. Weakness lies, let us say, in visual discrimination or in the combination of words. Teaching can be directed toward the points of weakness and at the level of achievement. Percentiles are furnished for each test. From this, conclusions as to when to begin reading can be drawn from the test as a whole as well as to the area and magnitude of the weakness which prevents the child from being ready for formal reading.

The chances for success in reading are calculated for each level of score on the Metropolitan Readiness Test, and a critical score is given below which the chances of success are small indeed. Though other factors enter into success besides what can now be measured, this notion of chances of success aids the teacher in making tentative any grouping of children based on the scores received in the test.

Intelligence tests and tests of reading readiness together furnish information highly predictive of subsequent success in reading. The latter tests especially break down the total aptitude for reading into special areas where modifications in programs of materials and procedure can be made. Definite conclusions can thus be drawn as to whether or not a child is ready to begin formal instruction in reading. Guidance of the finest kind can thus be rendered at the very beginning of a school career.

Table 5 contains a list of tests of reading readiness.
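The notion of "chances of success" at each score level can be illustrated with a small expectancy table. The sketch below is only an illustration in Python: the pupil records, the ten-point score bands, and the 50 per cent cutoff are all hypothetical and are not taken from the Metropolitan manual.

```python
from collections import defaultdict


def expectancy_table(records, band_width=10):
    """Group (readiness_score, succeeded) pairs into score bands.

    Returns {band_start: (number_of_pupils, proportion_successful)},
    ordered from the lowest band to the highest.
    """
    counts = defaultdict(lambda: [0, 0])  # band_start -> [pupils, successes]
    for score, succeeded in records:
        band = (score // band_width) * band_width
        counts[band][0] += 1
        counts[band][1] += 1 if succeeded else 0
    return {band: (n, wins / n) for band, (n, wins) in sorted(counts.items())}


def critical_band(table, min_chance=0.5):
    """Lowest band whose chance of success reaches min_chance (None if none does)."""
    for band, (_, chance) in table.items():
        if chance >= min_chance:
            return band
    return None


# Hypothetical records: (readiness score, succeeded in first-grade reading?)
records = [
    (12, False), (15, False), (18, False), (19, True),
    (23, False), (25, True), (27, False), (28, True),
    (31, True), (34, True), (36, False), (38, True),
    (41, True), (44, True), (47, True), (49, True),
]

table = expectancy_table(records)
for band, (n, chance) in table.items():
    print(f"scores {band}-{band + 9}: n={n}, chance of success={chance:.2f}")
print("lowest band with at least an even chance:", critical_band(table))
```

A teacher using such a table would treat the critical band as tentative, since, as noted above, factors other than the test score enter into success.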


Tests of Reading Achievement 


Achievement tests in reading offer a much larger sampling and a greater variety of reading situations than [...]. In a general testing program [...] results might be discovered in reading, arithmetic, language, etc. Let us suppose that one of the unsatisfactory areas is reading. Before undertaking a program for the improvement of the children's reading abilities it [...] a comprehensive reading test. From such a test, there may be obtained (1) a more dependable report of the children's general reading abilities, and (2) more analysis of the difficulties which the children have in reading. In the secondary school where test batteries have not been too satisfactory, achievement tests of reading have been of great value. It has been discovered that poor scholarship in several subjects has frequently been due to the students' failure to form satisfactory reading habits in the elementary school.


Reading Tests in the Elementary School 


The importance of reading has been so universally acclaimed that a large number of instruments have been constructed for its measurement. Gray's tests of oral reading, described on page 108, have proved


whom fellow 


wheel mail 


banjo bandage 


blanket bandit 


neglect saddle 


needle seldom 


à Е У reading tests, 
A good illustration of a test of achievement which also offers various opportunities for diagnosis and guidance is the Gates Primary Reading Tests. The tests are divided into three types: (1) word recognition, (2) sentence reading, and (3) paragraph reading.

In Type 1, the test of word recognition, a picture of an object is presented in the left part of a drawn box. In the right section there are four words, one of which is the name of the picture. The examiner says, “I want you to look at the first picture. Next to it are four words. One of the words goes with the picture. You are to draw a ring around that word that tells about the picture.” Figure 7 shows samples from the test.

There are altogether 48 items, and 15 minutes is allowed for the test. Such words as “sit,” “hen,” “bear,” “cock,” “stand,” “crow,” “window,” “leaf,” “lake,” “roof,” and “drive” are included.

The second part of this test, Type 2, contains short sentences printed out with appropriate answers indicated in pictures. Figure 8 gives two illustrations.

7. Do you like to go camping? It is fun to sleep in a tent. Draw a line under something you might take on a camping trip.

13. A pumpkin with a funny face stands for Hallowe'en and a lighted tree for Christmas. Easter brings Bunny with her basket of eggs. Put X on what stands for Hallowe'en.

Fig. 8. Gates Primary Reading Tests, short sentences, Items 7 and 13. (By permission of Bureau of Publications, Teachers College, Columbia University.)


The sentences increase in length and complexity from “This is a hat,” to “This bottle is full of ink,” to “The young daughter has pretty clothes.” There are 35 items, and 15 minutes is allowed for taking the test.

The third part of the test, Type 3, paragraph reading, is built along the same lines as the first two but has longer passages to read and interpret. The paragraphs vary from “Draw a [...] told her boy to jump into the [...] the boy to the car,” to a paragraph of [...] sentences. Twenty minutes is assigned for [...]. The words were selected with great care from Thorndike's word book so that usefulness for work in the elementary school was [...]. Norms are furnished for tests given both at the beginning and at the end of the terms, as well as for the bright, medium, and dull. The reliability of the test seems to decrease with the grade. From [...] the figures are .86, .71, .72, and .52. It is clear that the test has most value in grade 1. The correlation with teachers' estimates of ability to recognize words is .74.

SET I—No. 2

A boy had a dog.
The dog ran away.
The boy ran after him.
He ran very fast.
He caught the dog.
He took him home.
The boy said,
“You are not a good dog.
You must stay at home.”

SET II—No. 1

A nest is in a big green tree. The mother bird made the nest. She put it on the branch of the tree among the pretty leaves. She made it of twigs, leaves, and grass. She put soft rags inside of it. The nest has five baby birds in it.

The nest is large and round. The little birds will not fall out. The nest holds the mother bird and the little birds, too. It is hidden under the leaves. The old cat cannot see it. He does not know where the birds are. He will not find them there.

The nest is the home of the birds. It is a bed for the baby birds. The wind rocks it back and forth. The nest is very strong and the wind cannot blow it down. The little birds eat and sleep all day. They will learn to fly very soon.
The third instrument used in these grades is Gray's Oral Reading Test. There are four sets:
Set I. First grade
Set II. Second and third grades
Set III. Fourth and fifth grades
Set IV. Sixth, seventh, and eighth grades


Each set has five samples of approximately equal difficulty. Those illustrated on pages 106 and 107 [...]

The sun pierced into my large windows. It was the opening of October, and the sky was a dazzling blue. I looked out of my window and down the street. The white houses of the long, straight street were almost painful to the eyes. The clear atmosphere allowed full play to the sun's brightness.

If a word is wholly mispronounced, underline it, as in the case of “atmosphere.” If a portion of a word is mispronounced, mark appropriately as indicated above: “pierced” pronounced in two [...], omitting the s in “houses,” [...]. Omitted words are marked as in [...] and “and”; substitutions as in the case of “many” for [...]; repetitions as in the case of [...]. Two or more words should be repeated to count as a repetition.


[...] where it is needed.

There are a great many tests suitable for testing reading in the intermediate grades. Among these are two, the Gates Silent Reading Tests (grades 3 to 8) and the Iowa Silent Reading Tests, elementary test (grades 4 to 9).


Gates Silent Reading Tests

The Gates Silent Reading Tests are divided into four parts:¹

Type A, Reading to Appreciate General Significance, consists of a set of 24 short paragraphs, to be read for 6 minutes for accurate general impression. The reading required resembles that used in casually reading a novel or newspaper. An item of about medium difficulty is used here for illustration:

10. The dog ran to meet the man coming up the path. He wagged his tail joyously and barked with short, excited barks. The man leaned down and patted the dog on the head. Then he rolled up the paper that was under his arm and gave it to the dog. The dog ran with it up the path toward the house, his tail wagging all the time.
Draw a line under the word that best tells how the dog felt: sad afraid lonely weary happy

Type B, Reading to Predict the Outcome of Given Events, is also composed of 24 items, to be read for 8 minutes. This kind of reading involves analysis of what is read and a thinking of the facts together to predict the outcome of the events described. An example is:

Pat Dolan lived in a crowded part of New York City. His parents were very poor. What money he earned selling papers he gave to them. One day a woman gave him a quarter. Pat had always longed to ride on a big green bus. He could hardly wait for Sunday when he did not have to go to school or sell papers. At last Sunday [...]
Pat bought a toy dog with a squeak
He went to church in his father's car
He took a long ride on a big bus
He sold a hundred papers that day

Type C, Reading to Understand Precise Directions, consists also of 24 items, to be read for 8 minutes. All items contain pictures which are marked in some way indicated in the paragraph. The correct response involves “rigid, careful reading” (Fig. 9).

Type D, Reading to Note Details, includes 18 items, to be read for 8 minutes. [...] comprehend several points in a paragraph [...] answers.

10. [...] flowers. Among those that can be found in the early fall are the goldenrod and the aster. Think of the color they give to the [...] of the hills. [...] tells that these two flowers were once two little girls who wanted to make everyone happy. So a fairy changed them into goldenrod and asters.
When are goldenrod and asters found?
Spring Summer Fall Winter
What does the story say these flowers were once upon a time?
Stars Girls Sunbeams Boys
How did they want to make everyone feel?
Gay Excited Young Happy

¹ Text used by permission of Bureau of Publications, Teachers College, Columbia University, New York.


Inspection of the characteristics of the four types of reading will convince anyone of the close relationship between objectives in teaching reading and these measuring instruments. The tests are easily scored for each of the four types: A, B, C, D. Since norms are furnished for the test as a whole and for the [...] difficulties in reading. A pupil also may try only a few items but get them all right, or he may attempt many with only a small number correct, or he may follow some middle course. We may thus discover a slow, laborious reader, a rapid, haphazard sort of a reader, and one who reads at a normal rate with normal success. All these facts aid the teacher in her attempt to provide materials and procedures for improving the reading process. Gates has furnished suggestions and further reading for improving poor reading as indicated by scores in each of the four types of the test.

11. Some things grow on trees and some things grow in the ground. Here is a walnut, a banana, and a beet. Walnuts and bananas grow on trees, and beets grow in the ground. Draw a line under the ones that grow on a tree.

[...]

Fig. 9. Gates Silent Reading Tests, [...] directions.
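The distinction drawn above between the slow, laborious reader and the rapid, haphazard reader rests on two numbers: how many items were attempted and how many of those were correct. A minimal sketch in Python follows; the cutoff values and the function name are hypothetical choices for illustration and are not Gates's norms.

```python
def reader_profile(attempted, correct, total_items,
                   low_attempt=0.5, high_attempt=0.8, min_accuracy=0.75):
    """Classify a pupil from items attempted and items correct.

    All thresholds are illustrative assumptions, not published norms.
    """
    if attempted == 0:
        return "no items attempted"
    attempt_rate = attempted / total_items  # rough index of rate
    accuracy = correct / attempted          # proportion of attempts correct
    if attempt_rate < low_attempt and accuracy >= min_accuracy:
        return "slow, laborious reader"
    if attempt_rate >= high_attempt and accuracy < min_accuracy:
        return "rapid, haphazard reader"
    return "middle course"


# Three hypothetical pupils on a 24-item type
print(reader_profile(attempted=8, correct=8, total_items=24))
print(reader_profile(attempted=24, correct=10, total_items=24))
print(reader_profile(attempted=16, correct=13, total_items=24))
```

In practice the cutoffs would be set from the grade norms furnished with the test rather than chosen arbitrarily as here.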

The Gates Silent Reading Tests have been widely used. There are, however, three important limitations of the tests. In the first place there is no experimental evidence that the four types of tests actually measure the outcomes of instruction. In the second place, the paragraphs are short and in many cases too easy for children of the upper grades. Since there is little gradation of difficulty the test tends to become a rate test. In the third place, the types are certainly not independent. Table V of the manual shows correlations ranging from .66 to .92 between scores received from the different types. The correlations of Type A with Type B are above .80 in 14 out of the 15 coefficients. Such high correlations indicate a tremendous amount of overlapping between the types.
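One common way to express such overlapping is the squared correlation, which estimates the proportion of variance two measures share. This interpretation is a standard statistical convention and is not taken from the Gates manual; the sketch below, in Python, simply applies it to the coefficients quoted above.

```python
def shared_variance(r):
    """Proportion of variance two measures share, given their correlation r."""
    return r * r


# Lowest, typical, and highest coefficients quoted from Table V of the manual
for r in (0.66, 0.80, 0.92):
    print(f"r = {r:.2f}  shared variance = {shared_variance(r):.0%}")
```

On this reading, even the lowest coefficient implies that more than two-fifths of the variance is common to the two types.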

The Iowa Silent Reading Tests, elementary form, are suitable for grades 4 to 9. They made use of the objectives described by experts in the field to build a test which would reflect satisfactorily improvement in each objective (see page 17, Chap. 2). An element of strength is its four equivalent forms. Tests for most of the [...] by Horn and McBroom are provided. The divisions of the test, with reliabilities based on 220 cases in grade 6, are given in Table 6.

TABLE 6. IOWA SILENT READING TESTS: ELEMENTARY TEST

Test 1. Rate and comprehension
    Science material [...]
    Reliability: .83 (rate), .68 (comprehension)
Test 2. Directed reading
    Science material [...]
    Reliability: [...]
Test 3. Word meaning
    General vocabulary [...]
    Reliability: [...]
Test 4. Paragraph comprehension
    Selection of central idea of paragraph
    Identification of details essential to the meaning of paragraph
    Reliability: [...]
Test 5. Sentence meaning
    Reliability: [...]
Test 6. Location of information
    Alphabetizing; using guide words [...]
    Reliability: [...]
Median standard score: [...]

This test of silent reading of the work-study type includes [...] four to five paragraphs in length to be read in the section on directed reading. In the section on paragraph comprehension two sets of questions are asked: (1) on selecting the topic of the paragraph, and (2) on [...]


[Profile chart not reproduced. Rows shown include Test 4, Paragraph Comprehension (Central Idea; Development); Test 5, Sentence Meaning; and Test 6, Location of Information (A. Alphabetizing; B. Use of Index), with a column for number right.]

Fig. 10. Profile chart, Iowa Silent Reading Tests. (By permission of World Book Company.)


[...] each other are satisfactory. They range from .18 [...]. [...] carefully computed from 1,600 to 1,900 cases gathered from 19 widely different communities located in 13 states.

In comparing the Gates Silent Reading Tests and the Iowa Silent Reading Tests, both widely used, we find the Gates less difficult. This test is less rigidly standardized than the Iowa and much of its reading is of short paragraphs. It has an indication of rate in the number of items attempted and a fine score of accuracy. The Iowa test, on the other hand, includes sentence meaning and the location of information not contained in the Gates. There is less overlapping among its constituent tests. It, however, does not have reading for prediction. There is little difference in the reliabilities of the tests which compose the total. Either test can be profitably used, with a preference for the Gates test in grades 3 and 4, and a vote for the Iowa test in grades 5 to 8. Included here is a selected list of reading tests suitable for the elementary grades.

LIST OF ACHIEVEMENT TESTS IN READING FOR THE ELEMENTARY SCHOOL

1. Iowa Silent Reading Test, elementary, grades 4-9. World Book Company, Yonkers, N.Y.
2. Detroit Reading Tests, grades 2-9. World Book Company, Yonkers, N.Y.
3. Gates Silent Reading Test, grades [...]. [...] Teachers College, Columbia University [...]
4. Los Angeles Primary Reading Test, grades 1-3. California [...], Los Angeles, Calif.
5. Emporia Silent Reading Test, grades 3-8 (survey). Kansas State Teachers College, Emporia, Kans.
6. Traxler Silent Reading Tests, Series I, grades 7-9. Public School Publishing Company, Bloomington, Ill.
7. Monroe Revised Silent Reading Tests, grades 3-8. Public School Publishing Company, Bloomington, Ill.
8. Sangren-Woody Reading Test, grades 4-8 (survey and diagnostic). World Book Company, Yonkers, N.Y.

Separate tests of arithmetic, of social science, and of language appear in the chapters on the measurement of mathematics, of social science, and of English respectively.

Reading Tests at the High School Level

There are many reading tests at the high school level. Among these, three will be mentioned:

1. Iowa Silent Reading Tests (advanced), grades [...]. World Book Company, Yonkers, N.Y.
2. Traxler Silent Reading Tests, Series II, grades 10-12. Public School Publishing Company, Bloomington, Ill.
3. Cooperative Reading Comprehension, grades 7-12. Educational Testing Service, Princeton, N.J.

The Iowa Silent Reading Tests have been widely used. The advanced test is composed in the same manner as is the elementary. The divisions of the tests with their reliabilities appear in the following table.


Test 1. Rate and comprehension
    Science material
    Social-studies material
    Reliability: [...] (rate), [...] (comprehension)
Test 2. Directed reading
    Reliability: .91
Test 3. Poetry comprehension
    Reliability: .80
Test 4. Word meaning
    Social studies
    Science
    Mathematics
    English
    Reliability: .90
Test 5. Sentence meaning
    Reliability: .85
Test 6. Paragraph comprehension
    Selection of central idea of paragraph: .54
    Identification of details essential to the meaning of the paragraph: [...]
Test 7. Locating information
    Use of index: .82
    Selection of key words: .91


be read are com 


А ага 
Posed of four to eight e 
king questions involving ©. 


E 
5 : nce Vocabulary of 15 words are hae і 
adhere," anq "latent." The mathematics Vocabulary is compos? Jes 
15 terms, of Which “degree,” “ori 
The fourth divis; 


Origin," and “linear” are samP ch 


Огп and McBroom ( 
computed from over 10,000 cases. It 


The raw scores are transmuted into standard scores, and these, by means of a table, are converted into percentiles. Provision is made on the front page of each test for constructing a reading profile of the seven divisions of the test. The test is a satisfactory silent-reading test of the work-study type. The material is somewhat academic and the techniques a trifle [...].
The test has the possibility of Tewarding to i 


noses 4 wit? 


соу! 
enged that weak points are disc acy 
5 


and understanding in the material read. When these critical areas are discovered programs of study aimed at a narrower function may be prepared. By this procedure more rapid progress is assured.

All analyses of reading difficulties stem from an understanding of the reading process itself. This process is much more complicated than at first appears. On the one hand, it depends upon clear perception of the symbols involved; on the other, it depends on the association and relation of these symbols among themselves as well as their relations to life experience. Good reading involves the formation of a hierarchy of habits which unifies several words into one idea, but the correct idea depends upon the accurate perception of the words themselves. Unless the words are accurately perceived reading may become an imaginative procedure in which words are added or subtracted with impunity. This nicety of balance between analysis and synthesis must be maintained.

In the case of many children who become poor readers, failure in clarity of perception, in word analysis, or in that unity of parts from which meaning derives is apt to take place. In some diagnostic reading tests, the emphasis is on perception; in others, on the relations of words; while others attempt to discover the point where the lack of meaning or recall appears. A few tests attempt to make a complete analysis of the individual's reading, with enough samples at each critical point to determine the exact location of the difficulty.


[...] of hearing, and (2) [...] facts gathered from the home which relate to possible sibling rivalry, change of the dominant hands, emotional reactions, and special interests. In the second place, this analysis of reading furnishes a check list of difficulties which is quite inclusive. [...] from the [...] may be regarded as a diagnostic record sheet:

1. Background skills
Hearing vocabulary poor
Hearing comprehension poor
Faulty voice or speech habits
2. Word mastery skills
Word recognition (determined by previous test)
Low sight vocabulary
Will not try difficult words
Can spell but not pronounce
Ignores word endings
Guesses at words from general form
3. Word analysis
Word analysis ability poor
Will not try difficult words
Has no method of word analysis
Sounds aloud by single letters, blends, syllables
Unable to combine sounds into words
Looks away from word after sounding
Sounding slow or inaccurate
Spells words: successful, inadequate
Silent word study: successful, inadequate
Enunciates badly when prompted
Systematic errors (tabulation of them)
Names of letters not known
Sounds of letters not known
Blends not known

[...] World Book Company, Yonkers, N.Y.


In like manner ch 


rase reading and comprehension) and in g 
reading habits. In silent readin 


heck list o 
- Low rate of silent reading 
High rate at the ех 


f mechanics as follows: 


ecessitates rereading. 


are provided for com 
1 


н in 
reading in speed, recall and security, 


e 
: 7 за raphs а? 
silently and then their content j "o сактана к: 


A small tachistoscope has been Constructed by means of whic 


А те 
word from a list may be €Xposed for a short length of time. : 
two parts in each of four lists, p. 


while Part 2 is to study the 4 


DEEST ators 
nally there js a phonetic inve? 


tog if 
x * : iff ties 
eck lists are included for analysis of difficult al 


h 
a 


1 


ote: 
8 the check list is unusually comP 


on? 
rÊ 


soD! 
jo 

017 o6 
art 1is meant to study flash recog on 


2 
^ nalysis of Words. АП incorrect Er s 
are recorded phonetically, Fi 


place for recording difficulties in spelling, difficulties in handwriting, 
and difficulties in written recall. 

Standards for the various parts of the test based on approximately 1,000 children are furnished. The author regards the opportunity for observation of errors under standard conditions as much more important than the norms. The check list of errors was based on the errors discovered in the reading of 4,000 children brought to the clinic. The manual states, “The check list of errors will be found to include all of the significant errors made by any child.”

This test lacks a thoroughgoing study of its reliability. In some instances the norms are not entirely clear. For example, the norms for written recall in silent reading are the same as those for oral recall. The author suggests that these two norms are sufficient for rough analysis. One student (Miles A. Tinker) thinks that the items concerning eye movements are of dubious value. On the whole, though, there is agreement that this test provides an especially helpful instrument for diagnosing and recording specific difficulties in reading.


Tests of Oral Reading 


[...] reading difficulties [...]. Gray's Oral Reading Test is definitely a diagnostic test. The tester makes his own records (1) in seconds, for each paragraph, and (2) [...] written on each paragraph (see page 108 for illustration).

The best use of the test appears in the study and classification of errors which are made. Ordinarily No. 1 of a set is given. Then after two or three weeks, No. 2 is given. In the meantime attempts are made to provide the sort of training which will produce improvement. However, records of errors made in informal reading are also kept and the total errors entered on the individual record sheet (Fig. 11).

It is clear from the study of this sheet that a satisfactory analysis of the mechanics of oral reading can be made. There is, however, no record of reliability in the manual. In a diagnostic test this is not so important as in other tests. The test's only major weakness is a failure to check the comprehension in any way.

In Table 7 appears a partial list of diagnostic reading tests.
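The progressive analysis described above amounts to tallying errors by type at each testing and comparing the tallies. The sketch below, in Python, is only an illustration: the error labels are simplified from the kinds of errors named on the record sheet, and the pupil's data are hypothetical.

```python
from collections import Counter


def tally(errors):
    """Count how often each type of error was recorded in one reading."""
    return Counter(errors)


def change(before, after):
    """Difference in error counts (after minus before) for every type seen."""
    types = sorted(set(before) | set(after))
    return {t: after.get(t, 0) - before.get(t, 0) for t in types}


# Hypothetical records for one pupil: Test No. 1, then No. 2 three weeks later
test_1 = tally(["omission", "substitution", "substitution",
                "repetition", "gross mispronunciation", "omission"])
test_2 = tally(["substitution", "repetition"])

for error_type, delta in change(test_1, test_2).items():
    print(f"{error_type}: {delta:+d}")
```

A negative figure marks a type of error that became less frequent between the two testings, which is the kind of improvement the intervening training aims to produce.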


SPELLING 


The outcomes of the teaching of spelling have been clearly defined during the last half century. The number of words to be learned has been greatly reduced. Instead of the vast number of words, both usual

[Individual record sheet not reproduced. Title: Progressive Analysis of Errors in Oral Reading. Columns record the errors made on each numbered test and in daily reading. Rows are grouped under I. Individual Words (nonrecognition, gross and partial mispronunciation, enunciation, substitutions, insertions, omissions, other types of error) and II. Groups of Words (omitting lines, inserting, omitting, substituting, or repeating two or more words, other types of error), followed by the standard scores for the grade and the date of each test.]

Fig. 11. Gray's Oral Reading Test, individual record sheet. (By permission of Public School Publishing Company, Bloomington, Ill.)


TABLE 7. DIAGNOSTIC READING TESTS

Durrell Analysis of Reading Difficulty
    Grades: 1-6
    Time to give: 50 min.
    Characteristics: See discussion in chapter
    Publisher: World Book Company

Van Wagenen and Dvorak Diagnostic Examination of Silent Reading Abilities
    Grades: Division 1: 4-5; Division 2: 6-9; Division 3: 10-12
    Time to give: Part I: 5 min. No time limit on Part II (45 min.) and Part III (60-90 min.)
    Characteristics: Norms based on 30,000 urban and 15,000 rural children
    Publisher: Educational Test Bureau

Gray Oral Reading Tests, Sets 1, 2, 3, 4
    Grades: 1-8
    Time to give: Time varies. Depends on reader. Comparatively short.
    Characteristics: See discussion in chapter. Norms for rate and accuracy
    Publisher: Public School Publishing Company

Ingraham-Clark [...] Tests
    Grades: Primary: 1-3; Intermediate: 4-8
    Time to give: Part I: 30 min. (about). Part II: no time limit: “Turn to next test when 90% are through.”
    Characteristics: Two forms for each division. r, Form 1 and Form 2 in grade 2 = .94. Each part has a reliability within grade of .87-.95. Primary: ability to recognize word forms and likeness, and differences among words, both visual and auditory stimuli. Intermediate: parts 1 and 2; reliabilities vary (.82-.95); words similar and their opposites; auditory visual recognition, sentences and paragraph meanings; relevant and irrelevant statements to be judged
    Publisher: California Test Bureau [...] of South California. Book Depository

Ophthalmograph
    Grades: Any grade
    Time to give: Varies
    Characteristics: Photographs the number of eye fixations, refixations or regressive fixations, recognition span, rhythm, eye coordination, reading speed, and rhythm
    Publisher: American Optical Company




Survey Spelling Scales of Test Batteries 


The spelling tests given as a part of an achievement battery are
usually composed of words carefully selected from available [...].

The Stanford Achievement Test,1 for example, is composed of [...]-four
words arranged in the order of their spelling difficulty. The first four
words are "it," "and," "ten," and "old"; the last four, [...],
"rabid," "contemporaries," and "dirigible." The spelling list for
each grade starts and ends at defined positions. The second grade
spells the first 40; the third grade, the first 50; and the fifth grade
starts at the twenty-first word and goes through the seventieth. All
the grades, except the second, spell 50 words. The manner of giving
is as follows: (1) the word is pronounced, (2) the word is presented in a
sentence which determines its meaning, and (3) the word is pronounced
again and only then spelled. Two illustrations are:
42. shed —We kee 
43. afraid—Don’t be afraid. This dog g 


[...] Tests1 uses 75 words in its spelling [...] and "glad"; the last
three are "deter[...]

As in the Stanford Achievement Test, pupils of each grade start at a
different place and end at a defined place. Examples are:
1. Grade 5 starts at the first word and spells through No. 50.
2. Grade 6 starts at the sixth word and spells through No. 55.
3. Grade 8 starts at the twenty-sixth word and spells through No. 75.


The method of presentation is also the same as that of the Stanford
Achievement Test. Examples are:
26. toward—He + 


[...] the spelling test in the battery. For example, in the advanced
[...] the 30 words begin with "grocery," "doubt," and "concert" and end
with "souvenir," "inflammable," [...]. [...] utilized consists of first
pronouncing the word, then presenting the word in a sentence [...]
meaning, and finally pronouncing it again before it is spelled.

1 Items by permission of World Book Company, Yonkers, N.Y.




Other achievement batteries usually contain spelling scales composed
of lists of carefully selected words, either to be spelled after
pronunciation and definition or else embedded in sentences which are
dictated and copied. In tests of spelling suitable for the high school or
college, sometimes the correct spelling of a word appears among three
or four misspellings of the same word.


Separate Spelling Tests 


Spelling tests unconnected with test batteries usually include many
more words to be spelled. It is thus possible to select words of about
equal difficulty and combine them into several sets or tests. It might
even be desirable to have a test each month and to graph the improvement
or lack of it which obtains from one month to the next. Most of
the words which are necessary in ordinary communication could be
included in these monthly tests. Three tests will be described here:
(1) the Ayres Spelling Tests, (2) the Iowa Spelling Tests (Ashbaugh),
and (3) the Morrison-McCall Spelling Scale.

The Ayres Spelling Scale consists of 1,000 words most frequently
used in written discourse. They were selected from 368,000 running
words written by 2,500 different persons. These words were selected
from a list which was combined from four studies of words used in
newspapers [...] good literary [...] 50 lists of 20 words each to
children to [...] these 1,000 words were spelled [...] children living in
84 cities in different [...] the average of 1,400 spellings was made of
[...] the difficulty of each word for each grade was determined. Words of
[...] in 26 columns from A to Z.

The scale consists of these 1,000 most useful words arranged on one sheet
with [...] of words at the ends and many toward the middle.
Under each letter and just above the list of words is a set of
percentages which indicate the percentages correct which were spelled by
[...]. Under the letter O are the percentages 27, [...] 84, which are an
indication of the words correctly [...]. In preparing a test from this
list of 1,000 graded words, the best procedure is to select [...] 25 words
from the column with about 50 per cent of correct [...]. If the class
varies greatly in its ability to spell, or if there are not sufficient
words in the appropriate column, some of the words may be selected from
the less difficult and some from the more difficult [...]. [...]
permissible to use words varying in difficulty from 16 per [...] (about
one S.D. from the mean) [...] tends to distribute the pupils' spelling
scores [...].

[...] the Ayres Spelling Scale becomes the Buckingham Extension of the
Ayres Spelling Scale. Buckingham added 505 words to [...] 1,000.

1 [...] in Spelling, [...] Bulletin of the Division of [...]
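
The remark about "one S.D. from the mean" can be checked numerically. The sketch below is an illustration only and not part of the Ayres materials: it assumes a normal distribution, the word list and its percentages are hypothetical, and the limits of 16 and 84 per cent used as defaults are simply the rounded points one S.D. below and above the mean of a normal curve.

```python
from statistics import NormalDist

unit_normal = NormalDist()  # mean 0, standard deviation 1

# Proportion of a normal distribution lying below -1 S.D. and below +1 S.D.
below = unit_normal.cdf(-1.0)
above = unit_normal.cdf(1.0)
print(round(below * 100), round(above * 100))


def words_in_range(difficulty, low=16, high=84):
    """Keep the words whose per cent correct lies within [low, high]."""
    return sorted(word for word, pct in difficulty.items() if low <= pct <= high)


# Hypothetical per cent correct for one grade; not taken from the Ayres sheet.
sample = {"catch": 50, "the": 99, "judgment": 12, "often": 58, "because": 42}
print(words_in_range(sample))
```

Words that nearly every pupil spells, or that nearly none can spell, fall outside the limits and are dropped, since they do little to separate one pupil's score from another's.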
The Iowa Spelling Scales, published in 19[...], use 2,997 words instead
of the 1,000 used by Ayres. These words were found by Anderson1 to be
most frequently used in the written correspondence of adults.
Ashbaugh,2 the author of these scales, arranged the words in [...] scales
intended for use in grades 2 to 8. The difficulty of the words was
determined by 200 spellings of each word of the scale. The words
constituting each test and suitable for a certain grade are arranged in
groups according to difficulty. Here is an example for grade 5 with
different percentages of words spelled in the standardization of the test.


53 per cent    54 per cent    55 per cent    59 per cent
advertised     affair         accept         alfalfa
article        awful          advancement    channel
assist         considered     advertise      connected
automatic      corrected      agreeable      contemplate
carrying       correction     attended       decided
(out of 23)    (out of 29)    (out of 26)    (out of 31)
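
The percentage heading each column above is the share of spellings of a word that were correct. A minimal sketch of that computation follows; the tallies are hypothetical (chosen only so that the results fall in the columns shown), the figure of 200 attempts follows the text, and the function names are this sketch's own rather than anything in the Iowa scales.

```python
from collections import defaultdict


def per_cent_correct(correct, attempts):
    """Difficulty index: whole-number per cent of attempts spelled correctly."""
    return round(100 * correct / attempts)


def group_by_difficulty(tallies, attempts=200):
    """Map each per cent value to the alphabetized words at that difficulty."""
    groups = defaultdict(list)
    for word, correct in tallies.items():
        groups[per_cent_correct(correct, attempts)].append(word)
    return {pct: sorted(words) for pct, words in sorted(groups.items())}


# Hypothetical tallies of correct spellings out of 200 attempts per word.
tallies = {"affair": 108, "awful": 108, "accept": 110, "alfalfa": 118}
print(group_by_difficulty(tallies))
```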


You will note that the words within the [...] those whose difficulty
approximates 50 per cent [...] use [...] better results are obtained when
the difficulty approaches 50 per cent because such difficulty offers
opportunity for [...] the poor and the excellent spellers.

The Iowa Spelling Scale has certain advantages of great importance:
1. It contains 2,997 well-graded words already arranged in order of
spelling difficulty.
2. The words are socially highly useful.
3. The words [...]

1 Anderson, op. cit.
2 Ashbaugh, E. J., The Iowa Spelling Scales. Bloomington, Ill.: Public
School Publishing Company, 1922.


15. done - Has he done the work? - done.
39. reference - He made reference to the lesson - reference.

In [...] list the words are arranged from easy to hard but the eight
[...] in difficulty. "All the words in each list of this spelling
scale were selected from Ayres' Spelling Scale and Buckingham's
Extension of the Ayres' Spelling Scale, in such a way as to make all lists
equally difficult, and the words were required in addition to appear
among the 5,000 most commonly used words as reported in Thorndike's
Word Book."

Norms are furnished for each grade from 2 to 9 as well as for each
age. It is clear that this scale furnishes a well-defined procedure for
administering words from the Ayres Scale.


OTHER TESTS OF SPELLING

1. Public School Achievement Test in Spelling, grades 2-8. Four forms.
[...] test words [...] the Iowa Spelling Scales. Public School Publishing
Company, Bloomington, Ill.
2. [...] Standard Research Tests in Spelling, grades 2-8. S. A. Courtis,
Detroit, Mich.
3. Davis-Schrammel Spelling Tests [...] all grades. Bureau of Educational
Measurements, Kansas State Teachers College, Emporia, Kans.
4. Unit Scales of Attainment in Spelling, grades 3-8. Educational Test
Bureau, Minneapolis, Minn.
5. The Gates-Russell Spelling Diagnosis Tests, 1937.2

Uses of Spelling Scales and Tests

First and foremost, spelling tests can be used to make certain that
[...] can spell the 3,000 words so necessary for ordinary communication
as well as the average of the country. Do the children spell correctly
as many words as the norms demand? If they do, some teachers and
administrators are then satisfied. But this is not enough. If the [...]
words of the Iowa Spelling Scales represent the minimum essentials in
spelling, then the vast majority of children should be able to spell all
the words. It is here that such scales have their greatest use.

[...] scales, since they are easy to use and to score, can be given [...]
and the results graphed not for public consumption but to study each
individual case. Thus a child can be easily taught to make a graph
indicating the number of words [...] (1) on the first test, (2) on the
second test, (3) on the third test, etc. He has [...] evidence clear and
unmistakable as to [...]

1 Manual, p. 7. Yonkers, N.Y.: World Book Company, 1923.
2 Gates, A. I., and D. H. Russell, Diagnostic and Remedial Spelling
Manual. New York: Bureau of Publications, Teachers College, Columbia
University, 1937.


He should also make a list of the words he has spelled incorrectly.
[...] procedure constitutes a genuine [...]. [...] spelling scales arises
in connection with analyzing [...] there are found students [...] spelling
is poor indeed. Gates and Russell tell of a pupil who was [...] that
might possibly influence [...] spelling deficiencies should first [...]
wrongly entered. If a child pronounces [...] we have such a [...]
Diagnosis Tests,2 which [...] insertions, omissions, errors, additions,
etc., in their spelling. More precisely stated, this [...] furnishes a
method of discovering [...] listed below. From their investigations [...]
developed this series of diagnostic tests, which they list as follows:
1. Spelling words orally
2. Word pronunciation
3. Giving letters for letter sounds
4. Spelling one syllable (nonsense syllables)
5. Spelling two syllables (nonsense syllables)
6. Word reversals
7. Spelling attack: method of study
8. Auditory discrimination
9. Visual, auditory, kinesthetic, and combined study methods

1 Ibid., pp. 30-31.
2 Ibid.

By means of these tests, which a teacher, with a little practice, can
easily learn to give, analysis can be made of the sources of error and
teaching and practice directed to the areas where it will count most.
In many cases, it is discovered that the pupil never has learned a good
method of studying a word. For this reason, he must be taught how to
learn to spell.
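
The pupil's own record described in this section (a graph of words spelled correctly on successive tests, together with a list of the words missed) can be kept in a few lines. The sketch below is illustrative only: the dictated words and the pupil's attempts are hypothetical, and the function names are this sketch's own, not part of any published test.

```python
def misspelled(dictated, written):
    """Return the dictated words the pupil did not write correctly, in order."""
    return [word for word, attempt in zip(dictated, written)
            if attempt.strip().lower() != word.lower()]


def persistent_errors(first_errors, *later_errors):
    """Words missed on every test given: likely first candidates for study."""
    common = set(first_errors)
    for errors in later_errors:
        common &= set(errors)
    return sorted(common)


def progress_graph(scores):
    """One text line per test: test number, a bar of '#', and words correct."""
    return [f"Test {n}: {'#' * score} {score}"
            for n, score in enumerate(scores, start=1)]


# Hypothetical dictation and two attempts by the same pupil.
words = ["toward", "afraid", "reference", "done"]
first = misspelled(words, ["tord", "afraid", "refrence", "done"])
second = misspelled(words, ["toward", "afraid", "referance", "done"])

print(first, second, persistent_errors(first, second))
scores = [len(words) - len(first), len(words) - len(second)]
print("\n".join(progress_graph(scores)))
```

A word that appears on every list of errors is one the pupil has not yet learned by his present method of study, and it is a natural place to begin diagnosis.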
HANDWRITING

As a means of communicating with others, handwriting has lost
some [...] in the last fifty years to typewriters and other sorts of
recording machines. However, in social communication, and in making
private notes it still maintains an important position. It is also the
principal means of helping the student clarify his own thoughts on any
topic. "Writing maketh an exact man."

The genetic development of this complicated motor habit throws
some light on its complexity. In the early stages of learning,
handwriting is largely a matter of perceptual motor learning. The child
looks at the letter and then draws it. Tracing it by means of overlaid
tissue paper or controlling the direction of movement by means of
holding the child's hand is of little value. He must learn to draw what
he sees. The model is before him. As time goes on, the perceptual object
or model is removed and learning becomes ideomotor. These simple,
[...] habits must be integrated in such a way that the handwriting
is smooth and rapid. Sometimes smoothness and speed play [...] in those
respects is [...] at the expense of quality. [...] the quality in more
or less formal exercises. [...] the greatest enemy of quality, therefore,
is the shift of attention from form to substance as one writes. If one's
thoughts shift [...] and legibility, they improve but ideas and
organization suffer. For this reason, quality and speed of handwriting
must be learned so well that they are almost self-running.

AIMS AND OBJECTIVES IN TEACHING HANDWRITING

The aims and objectives of the teaching of handwriting are determined
by levels of attainment achieved by pupils in the appropriate
grades. In the earlier grades norms for speed and quality are defined
by those levels of attainment that children under good instruction
have succeeded in reaching. But in the upper grades adult [...]
are the determining factors. Questions of how well employers [...]
employees to write without being penalized in their work [...]
the norms for achievement.1
One investigator (Koos) studied the quality of 1,053 specimens
secured from social correspondence and 1,127 samples of the hand[...]
as quality 60 on the Ayres Handwriting Scale. [...] speed is closely [...]
could be pushed up to 80 or 90 letters per minute without greatly
affecting the quality.

A second aim is to teach [...] in handwriting [...]

[...] writing properly on a page. Here instruction in the [...]
of headings, margins, and spacing seems most important. You [...]
see that one of the variables on the F[...] spacing.

[...] to write well whenever [...] in handwriting are due pretty [...]
pupil to realize the importance of [...]

In short, a child who writes well enough for satisfactory social [...],
who has learned a method of analyzing and improving his own
handwriting, who arranges the written material properly on the page,
and who desires to write well at all times has fulfilled the objectives
of handwriting in the elementary school.


1 Koos, L. V., "The Determination of Ultimate Standards of Quality in
Handwriting for the Public Schools," Elementary School Journal (1918)
18:422.

[...] rate and quality of handwriting have been measured. These two
variables are [...]. To a very considerable extent [...] subject. If the
set is obtained by [...] the subject to "write as rapidly as possible,"
then [...] the subject to "write as [...]," rate suffers. For example,
one experimenter (Freeman, 1915) [...] when he called for quality, speed
was reduced 3.7 per cent [...] quality improved 6.2 per cent. On the
other hand, when he called for speed, quality decreased 9.1 per cent but
rate increased 27.2 per cent. [...] the instructions to subjects to
obtain the best results are, "Write as well as you can and as rapidly as
you can."
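
To see what such percentages mean for a single pupil, they can be applied to a starting rate and quality. The sketch below is an illustration only: the baseline of 60 letters per minute at quality 50 is hypothetical, and treating the reported per cent changes as changes in the scale scores themselves is an assumption of this sketch, not a statement about how the original experiment was scored.

```python
def apply_per_cent_change(value, per_cent):
    """Return value after a signed per cent change (negative = decrease)."""
    return value * (1 + per_cent / 100)


# Hypothetical baseline: 60 letters per minute at quality 50.
baseline_rate, baseline_quality = 60.0, 50.0

# Per cent changes (rate, quality) as reported in the text for each set.
sets = {"quality": (-3.7, 6.2), "speed": (27.2, -9.1)}

for name, (rate_change, quality_change) in sets.items():
    rate = apply_per_cent_change(baseline_rate, rate_change)
    quality = apply_per_cent_change(baseline_quality, quality_change)
    print(f"set for {name}: rate {rate:.1f}, quality {quality:.1f}")
```

On these assumed figures, calling for quality costs little speed, while calling for speed buys a large gain in rate at a noticeable loss in quality.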


Rate

It is comparatively easy to secure reliable measures of rate provided
a few simple precautions are taken. When a subject is being measured
for [...] rate of handwriting we must not confuse the issue by
introducing other variables. Satisfactory results are obtained if the
child knows the material by heart, if he can easily spell all the words,
and if the words are not too long. It is customary to use the same sample
for both rate and quality. In this case, the material should be the same
as that appearing in the rating scale. [...] Gettysburg edition of the
Ayres Scale, [...] Gettysburg Address. He would [...] copy on the board
Lincoln [...] go over it with the children, [...] thoroughly acquainted
with the [...] "Four 4 score 9 and 12 seven 17 [...]" [...] the number of
letters written per [...]

There is considerable justification for using simpler, interesting
material [...]. Thus the American Handwriting Scale1 [...] words chosen
with the child's interests in mind: "Anna 4 has 7 six 10 [...] 14 baby 18
kittens 25. They 4 play 8 with 12 a 13 round 18 red 21 ball 25." The
units are 25 letters long. The reliabilities of rate scores are all
high indeed.
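
The small numbers printed after each word in such copy are running totals of letters, which let the scorer read off the count at the point where the pupil stopped. A minimal sketch of the tally and of the rate score follows; the pupil's figures in the last line are hypothetical, and the function names are this sketch's own.

```python
def cumulative_letter_counts(text):
    """Pair each word with the running total of letters written through it."""
    total = 0
    pairs = []
    for word in text.split():
        total += sum(ch.isalpha() for ch in word)  # punctuation is not counted
        pairs.append((word, total))
    return pairs


def letters_per_minute(letters_written, minutes):
    """Rate score: letters written divided by the time allowed in minutes."""
    return letters_written / minutes


print(cumulative_letter_counts("They play with a round red ball."))
print(cumulative_letter_counts("Four score and seven"))
print(letters_per_minute(letters_written=120, minutes=2))  # hypothetical pupil
```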

Quality

In measuring quality it is necessary to compare the subject's sample
of handwriting with a set of samples whose qualities, ranging from poor
to excellent, have already been determined. Such sets of graduated
samples are (1) the Thorndike Scale for Handwriting of Children,
(2) the Ayres Measuring Scale for Handwriting (Gettysburg edition),
and (3) the Conard Manuscript Writing Standards.

1 West, Paul V., American Handwriting Scale, grades 2-8. Chicago [...]
A. N. Palmer [...], 193[...].

The Thorndike Scale for Handwriting of Children1 was first published
in 1910. It was the first scientifically constructed [...] for
educational measurement. This instrument consists of [...]
along a scale from No. 4, which was artificially constructed, to No. 18,
which is a copybook model. All the rest of the samples were written
by children. The differences between samples were determined [...]
sample to be better or worse [...] merit. If 75 per cent of equally
com[...] to be better than another, then [...] one unit better than the
other. [...] for a scale (4, 5, 6, 7, etc.) in which [...] between units
up and down the scale are approximately the same.
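
The 75 per cent rule fixes one point on the scale: a sample preferred by three judges in four stands one unit above the other. How other proportions might be turned into units is sketched below. This is an assumption of the sketch and not a description of the published procedure: it supposes that judgments are normally distributed, so that the distance between two samples is proportional to the normal deviate of the proportion preferring the better one.

```python
from statistics import NormalDist


def scale_units(proportion_preferring):
    """Convert the share of judges preferring one sample into scale units.

    Assumes normally distributed judgments, scaled so that a 75 per cent
    preference corresponds to exactly one unit. The proportion must lie
    strictly between 0 and 1.
    """
    z = NormalDist().inv_cdf(proportion_preferring)
    z_unit = NormalDist().inv_cdf(0.75)
    return z / z_unit


for share in (0.50, 0.75, 0.90):
    print(share, round(scale_units(share), 2))
```

An even split among the judges gives no difference at all, and a preference by nine judges in ten gives somewhat less than two units under this assumption.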
Some weaknesses have caused the Thorndike scale to be less used
than formerly [...]

HANDWRITING STANDARDS2

Grade                              [...]
Speed, letters per minute          35    45    55    64    72    77    80
Quality as measured on the
  Thorndike scale:
    [...]                          [...] 12.5
    [...]                          8.5   9.3   10.1  10.8  11.4  12.0
The Ayres Measuring Scale for Handwriting [...] functional and more
objective than "general merit." [...] who kept careful records of [...].
The scale value of each sample was determined by [...] average time used
up by the 10 assistants in its reading. [...] samples written in blue ink
and [...] 20 to 90 [...]. Some critics have felt the need of a score
lower than 20 [...] 90, but one can interpolate a 15 or a 95 without
[...] too great an error.

1 New York: Bureau of Publications, Teachers College, Columbia University.
2 Bloomington, Ill.: Public School Publishing Company.

Fig. 12. Norms, Ayres Handwriting Scale. (By permission of Department of
Education, Russell Sage Foundation, New York.) [Graph; axes: rate,
quality.]

The Ayres Measuring Scale for Handwriting, Gettysburg edition,
[...] the most used of all the handwriting scales. The norms are
given in Fig. 12.

The Conard Manuscript Writing Standards1 was developed to
[...] of teachers of the primary grades who [...] their pupils to writing
by [...] the manuscript method. There are two sets of scales: (1) for
pencil, (2) for pen. The pencil [...] is composed of samples 1 to 12
selected from 5,000 samples of manuscript writing. The samples vary in
quality from [...], practically illegible, to No. 12, which is quite
satisfactory for students in grade [...] (Fig. 13). This scale is
suitable for grades 1 to 4. The pen scale has 10 samples, reaching from
third grade to adult level. The [...] two samples were written by adults.
All the rest were written by children in grade 6 or below.

1 New York: Bureau of Publications, Teachers College, Columbia University.
By permission.


Fig. 13. Conard Manuscript Writing Standards, samples 3, 6, and 10 (pencil).


Use of Quality Scales

The sample which you wish to rate is slid along until it matches a
sample on the scale, say 50 on the Ayres scale. This 50 is put on the
back of the sample. The papers are now shuffled and the same process
[...]. It is thus possible to obtain two independent scores, a
fact which makes for accuracy. The score on the second rating is then
averaged with the score on the first. This average constitutes the score
for the paper. Best results of all are obtained by having each paper
rated independently by three different persons.
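
The arithmetic of combining independent ratings is simple enough to sketch. The ratings below are hypothetical, and the check on disagreement between raters is an addition of this sketch rather than a step named in the text.

```python
from statistics import mean


def paper_score(*ratings):
    """Average two or more independent ratings of the same paper."""
    if len(ratings) < 2:
        raise ValueError("need at least two independent ratings")
    return mean(ratings)


def disagreement(*ratings):
    """Spread between the highest and lowest rating of one paper."""
    return max(ratings) - min(ratings)


# Hypothetical Ayres-scale ratings of one paper.
print(paper_score(50, 60))       # two ratings by the same person
print(paper_score(50, 60, 55))   # three different persons
print(disagreement(50, 60, 55))
```

A wide spread between ratings of the same paper suggests that the paper be rated again before its average is recorded.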


Fig. 14. Sample 60, Ayres Handwriting Scale. (By permission of Department
of Education, Russell Sage Foundation, New York.)

For those teachers truly concerned about the reliability of quality
[...] practice in rating Thorndike's 50 samples is recommended.1
In this case each sample is scored independently and then compared
with the score agreed upon by experts (the true score). By means of
[...] practice considerable gains in accuracy can be achieved.

Reasonable Quality to Be Expected

It is the opinion of experts in the field, of employers, and of a large
number of the general run of people that quality 60 on the Ayres scale
written at the rate of about 70 letters per minute is a reasonable
achievement in handwriting at the end of grade 6. Figure 14 shows
sample 60 of the Ayres Handwriting Scale.

1 Thorndike, E. L., "Teachers' Estimates of Quality of Handwriting,"
Teachers College Record (November, 1914) Vol. 15.

DIAGNOSIS AND ANALYSIS OF HANDWRITING

Progress in motor learning begins with a general understanding
of the problem, proceeds by means of trial and error, with some
limitation of error through guidance and practice, and results in the
establishment of a certain level of achievement. If this level of
achievement is low, further progress is contingent upon the analysis of
performance and the direction of practice toward a much narrower
function. [...] analysis of habits may take place in handwriting. Three
procedures [...] at diagnosis and improvement will be mentioned:
(1) Freeman's Chart for Diagnosing Faults in Handwriting,1 (2) Gray's
Score Card for Measuring Handwriting,2 and (3) Freeman's Score Card of
Defects in Handwriting.3

Fig. 15. Freeman's Chart for Diagnosing Faults in Handwriting, letter
formation. (By permission of Houghton Mifflin Company.)

Freeman's Chart for Diagnosing Faults in Handwriting divides
handwriting into five separate traits: (1) uniformity of slant,
(2) uniformity of alignment, (3) quality of line, (4) letter formation,
and (5) spacing. Each part or division appears on the sheet at three
levels of performance, which have the accompanying scores: (1) poor, a
score of 1, (2) average, a score of 3, and (3) good, or excellent, a
score of 5. These five attributes of handwriting appearing at three
levels of performance are printed on one large page. Uniformity of slant
is judged by drawing lines parallel and close to the long letters such
as h, t, or b. The scoring of uniformity of alignment is facilitated by
drawing parallel lines above and below the written line. A reading glass
aids in judging the quality of line. The diagnosing of correct letter
formation is aided by a large number of little arrows pointing to poorly
formed parts of letters. A sample of that part of the scale called
Letter Formation is shown in Fig. 15. Freeman recommends that one
attribute be scored at a time, each independently of the others. By
summing the scores of an individual in each of the five traits a total
rank is obtained. Such a rank, while not completely diagnostic, does
tend to focus the teacher's thoughts on special aspects of handwriting
which need improving. It may also suggest that if one attribute is
practiced at a time greater improvement in handwriting may be attained.
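
The scoring just described, five traits each rated 1, 3, or 5 and then summed, can be sketched as follows. The pupil's scores are hypothetical, the trait names are shortened forms of those in the text, and the restriction to the three printed levels (no intermediate scores of 2 or 4) is an assumption of this sketch.

```python
TRAITS = ("slant", "alignment", "quality of line", "letter formation", "spacing")
ALLOWED = {1, 3, 5}  # poor, average, good or excellent


def total_rank(scores):
    """Sum the five trait scores after checking that each is 1, 3, or 5."""
    for trait in TRAITS:
        if scores.get(trait) not in ALLOWED:
            raise ValueError(f"{trait}: score must be 1, 3, or 5")
    return sum(scores[trait] for trait in TRAITS)


def weakest_traits(scores):
    """Traits with the lowest score: where practice might be directed first."""
    lowest = min(scores[trait] for trait in TRAITS)
    return [trait for trait in TRAITS if scores[trait] == lowest]


# Hypothetical pupil.
pupil = {"slant": 3, "alignment": 5, "quality of line": 3,
         "letter formation": 1, "spacing": 3}
print(total_rank(pupil), weakest_traits(pupil))
```

The total gives the rank, but the list of weakest traits is the more useful result for teaching, since it names the one attribute on which practice should be concentrated.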


Score Cards

Score cards attempt to describe in words and to arrange in a sort of
check list the elements which compose handwriting. Their best use is
in diagnosing difficulties rather than in giving a total score or rating.
Gray's Standard Score Card for Measuring Handwriting, which
appears in Fig. 16, not only lists the qualities to be studied but weights
them so that the total points of a perfect handwriting would be 100.
The formation of letters, so essential to legibility, is given a score of
26, the largest weight [...].

Freeman's Score Card of Defects in Handwriting, which is to be
used in connection with his Chart for Diagnosing Faults in Handwriting,
not only lists the defect but describes its most probable cause:

1 Boston: Houghton Mifflin Company.
2 Bloomington, Ill.: Public School Publishing Company.
3 Freeman, F. N., The Teaching of Handwriting. Boston: Houghton Mifflin
Company.


Defect

1. Too much slant: (1) Writing arm too near body
                   (2) Thumb too stiff
                   (3) Point of nib too far from fingers
                   (4) Paper in wrong direction
                   (5) Stroke in wrong direction

West's Score Sheet for Diagnosis of Defects in Samples of Handwriting1
and the Pressey Chart for Diagnosis of Illegibilities in Handwriting1
are other instruments used for diagnosis of handwriting. Such [...]
Practice Tests in Handwriting,2 (2) [...] Diagnostic Practi[...] sample
sentence [...] handwriting defects, analyzing defects such [...]

1 Bloomington, Ill.: Public School Publishing Company.
2 Yonkers, N.Y.: World Book Company.


Fig. 16. Gray's Standard Score Card for Measuring Handwriting. (By permission.)
[Score card by C. T. Gray listing, among other items, size (uniformity, too large, too small), alignment, spacing of lines, spacing of words, spacing of letters, neatness, and formation of letters, each with its weight, and a total score. Distributed by the Public School Publishing Co., Bloomington, Ill.]


[...] etc. On the other side of the chart are exercises for correcting
these defects.

Uses of Scales and Check Lists in Improving Handwriting

From the standpoint of the administrator there are several [...] of
measurement of rate and quality in handwriting. [...] be made between
the school grades of his system and the [...] as well as between
different school grades. Through [...] he can obtain an over-all picture
of the pupil progress in handwriting in a single school or in his total
school system.

The teacher finds in the scales and in the norms for quality and
rate of handwriting attainable goals of achievement. To [...]
pupils the ability to write as well as 60 on the Ayres scale at a rate of
70 letters per minute defines objectively the aims of [...]
can know what is expected of pupils in this school [...]. Greatest
help of all, however, comes to the teacher in the form of diagnostic
scales such as Freeman's or in check lists of possible defects. These
instruments define and narrow the problems of instruction but do not
[...] them. To be solved there are needed practice exercises [...] means
[...] pupils may practice at the point of error. Under such conditions
rapid improvement can be made in handwriting.

The pupil himself may find these scales and charts of handwriting
[...] pasted on a planed pine board [...] pupils may estimate directly
[...] of attainable goals is a stimulating [...] is taught to use the
diagnostic [...] is greatly increased. Instead of [...] "at it
generally," he learns [...] which are poorly formed. [...] of progress,
too, depends upon himself, a condition which is in itself a motivating
influence.

It thus becomes clear that handwriting scales and check lists may be
used both to improve learning and to aid in evaluating the handwriting
products which the subjects have produced.


SUMMARY

Tests of reading, spelling, and handwriting are described in this
chapter.

Reading tests in the areas of reading readiness, reading achievement,
and reading diagnosis are presented. Tests of reading readiness sample
those mental processes which are usually deemed necessary in beginning
the more formal aspects of reading. The levels of achievement in
vocabulary, language usage, knowledge of ordinary facts and events,
number information, etc., are tested. Achievement is measured by tests
which ask questions about vocabulary, the meaning of sentences and
paragraphs and sometimes about the uses of indexes for locating
information. The tests attempt to parallel teaching procedures with their
questions and to test for the fulfillment of the objectives for which the
teachers are striving. Diagnostic tests attempt to discover those sources
of weakness which prevent the comprehension of the meaning of what
is read.

Perhaps in no other subject are the aims and objectives more ade- 
Mately defined than in spelling. The criterion of curricular validity is 
Satisfied because the words to be tested are fairly well agreed upon. 

€ words, about 3,000 in number, were arrived at through analysis of 
correspondence, artielès, and books which well-known authors had 


Written, and through a consideration of children’s literature, news- 
ø the Bible. From these sources 


a ai ; 

z Pers, and the English classics, including 
ords needed by every one for ordinary correspondence were col- 
tests have been 


«ted and graded for difficulty. Spelling scales and (с : 
‘structed using these very words. The aims of teaching spelling are 
"n the most part realized when children can spell these words in addi- 
lon to the words used in their ordinary communication. Such tests 
De been shown to be usable in (1) checking the general spelling level 
(3), Class, (2) motivating the 


individual's spelling procedures, and 
analyzing the individual's m 


Vocabulary, language usage, know: 


isspelled words so that his accuracy in 
as also suggested that each 


“Pelling mi increased. It w 
Pupil Še ~h а par А technique for learning thoroughly the 
(tec Spelling of a word Achievement in handwriting 1s dependent 
upon rate and quality. It is easy to measure rate, for it is simply the
number of letters written per minute with material which is well known
to the subject and whose spelling difficulties have been removed.
Quality is estimated by comparing the subject's sample of writing with
samples whose positions in the quality scale have been previously
determined. A second type of scale divided handwriting into five
elements, among them quality of line and letter formation. Each of
these elements was presented at three levels: poor, average, and good.
Check lists composed of words descriptive of defects in writing were
also introduced. The hope was to enable teachers and pupils to determine
the exact points in handwriting at which there were defects. To obtain
the most satisfactory results from such diagnoses, exercises must be
provided so that effective practice may be centered on the points which
the diagnosis showed were in greatest need of improvement. The Leamer
and Courtis-Shaw are examples of practice exercises with clear-cut
standards of achievement before each learner. Practice thus becomes an
individual matter, with each child progressing at his own rate.
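
The measurement of handwriting rate described above amounts to a single division, and may be sketched in Python as follows. This is an illustration only, under stated assumptions: the function name `letters_per_minute` is hypothetical, and counting alphabetic characters only (ignoring spaces and punctuation) is an assumption, since the text does not say how the letters are to be tallied.

```python
def letters_per_minute(sample: str, minutes: float) -> float:
    """Handwriting rate: letters written divided by the minutes allowed.

    Assumption (not from the text): only alphabetic characters count as
    letters; spaces and punctuation marks are ignored.
    """
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    letters = sum(1 for ch in sample if ch.isalpha())
    return letters / minutes


# A pupil who writes these six words in half a minute:
rate = letters_per_minute("Four score and seven years ago", 0.5)
print(rate)
```

Because the material is familiar and free of spelling difficulties, the figure obtained reflects speed of writing rather than knowledge of the passage.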


QUESTIONS AND EXERCISES

I. READING

1. Name and explain five or six developmental traits on which reading
readiness depends.

2. In what res . . . reading readiness? Secure a copy of the Lee-Clark
Test of Reading Readiness and . . . point with the Metropolitan. Which
is superior for the purpose at hand? Evidence?

. . . used to discover th . . . ing. Do you think th . . .

6. How are tests o . . . times useful in di . . . studying the problem
of re . . . ordinary school.

II. SPELLING

1. a. Describe the various procedures used in deciding upon the words
whose spelling was to be learned in the elementary school.

b. What are the aims of the school in spelling instruction?

2. a. What limitations appear in the spelling test in the usual test
battery?

b. How can the separate tests get rid of these limitations?

c. What method of presentation of words to be spelled is common in test
batteries?

3. Describe and illustrate the uses to which spelling tests may be put.

4. What advantages do you find in such a test as the Gates-Russell
Spelling Diagnosis Tests over the usual tests?

5. Explain why many students regard the Iowa Spelling Scales as the best
of all spelling tests.

6. What is a good method for pupils to use in learning to spell new
words? Explain in detail.

7. What range of difficulty should you use in constructing a test from
the scale? Why?

8. Describe the process you would use in checking the social usefulness
of the words included in a spelling book.

III. HANDWRITING

1. Secure 15 or 20 samples of children's writing of as much of the
Gettysburg Address as they can get finished in 2 minutes.

a. Rate the quality on the Ayres scale, on the Thorndike scale. Place
the scale value on the back of each paper.

b. After the papers have been shuffled, rate them again a second time,
placing the score in front. Average the two ratings. If there is a wide
discrepancy between the two scores in the case of any paper, rate it
again a third time.

c. Erase your marks and get another member of your college class to
rate them. The average mark for any one paper is probably the best
indication of its true position on the scale.

2. Secure a copy of Freeman's diagnostic chart.

a. Rate the papers on each element of the chart.

b. Sum the ratings of the five elements.

c. How does this total agree with the rating of the same paper on the
Ayres scale?

d. Analyze the difficulties of several papers and list the defects on
each paper.

e. Apply the Gray Score Card to the same paper.

3. a. Compare the efficiency of the Thorndike and Ayres scales in
measuring samples of handwriting.

b. Which is most useful? Why?

4. Which do you think is the better procedure to get a satisfactory
judgment about the quality of handwriting: (a) by the use of a general
scale such as the Ayres, or (b) by scoring the paper on five elements
and summing these scores? Explain.

5. What is the relation between handwriting rate and age?

6. Describe the leading characteristics of standardized practice
exercises in handwriting. Illustrate by referring to the specific
exercises described in the text.

7. To what uses can the instruments for measuring the quality of
handwriting be put by (a) the administrator, (b) the teacher, (c) the
pupil?

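
The rating routine of Exercise 1 under Handwriting (rate twice, average, and rate a third time when the two ratings disagree widely) may be sketched as follows. This is a minimal illustration, not a prescribed procedure: the name `settle_rating`, the threshold `max_gap`, and the decision to average all three ratings once a third has been made are assumptions, for the exercise does not say how wide a "wide discrepancy" is or how the third rating is to be used.

```python
from statistics import mean


def settle_rating(first, second, rate_again, max_gap=10):
    """Average two quality ratings of one paper.

    If the two ratings differ by more than max_gap scale points
    (a hypothetical threshold), rate_again() is called to obtain a
    third rating, and all three are averaged (an assumption).
    """
    ratings = [first, second]
    if abs(first - second) > max_gap:
        ratings.append(rate_again())
    return mean(ratings)


# Two ratings close together: no third rating is requested.
print(settle_rating(50, 60, rate_again=lambda: 0))

# Two ratings far apart: a third rating (here 55) is obtained and included.
print(settle_rating(40, 70, rate_again=lambda: 55))
```

Passing the third rating as a function means the extra judgment is made only when it is actually needed.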
BIBLIOGRAPHY

I. READING

Books

Betts, E. A.: The Prevention and Correction of Reading Difficulties.
Evanston, Ill.: Row, Peterson & Company, 1936.

Durrell, Donald D.: Improvement of Basic Reading Abilities. Yonkers,
N.Y.: World Book Company, 1940.

Gates, A. I.: The Improvement of Reading: A Program of Diagnostic and
Remedial Methods, 3d ed. New York: The Macmillan Company, 1947.

Gates, A. I., et al.: Methods of Determining Reading Readiness. New
York: Bureau of Publications, Teachers College, Columbia University,
1939.

Greene, Harry A., Albert N. Jorgensen, and J. Raymond Gerberich:
Measurement and Evaluation in the Elementary School. New York:
Longmans, Green & Co., Inc., 1942.

Harrison, M. Lucile: Reading Readiness. Boston: Houghton Mifflin
Company, 1936.

Monroe, Marion: Children Who Cannot Read. Chicago: University of
Chicago Press, 1932.

Monroe, Marion, et al.: Remedial Reading. Boston: Houghton Mifflin
Company, 1937.

Tiegs, Ernest W.: Tests and Measurements in the Improvement of
Learning, pp. 110-125, 159-165. Boston: Houghton Mifflin Company, 1939.

Townsend, Agatha: "A Study of the Revised New Edition of the Iowa Silent
Reading Tests," pp. 31-39, in 1944 Fall Testing Program in Independent
Schools and Supplementary Studies, Educational Records Bulletin. New
York: Educational Records Bureau, 1945.

Webb, L. W., and Anna Markt Shotwell: Testing in the Elementary School,
Chaps. VIII-IX, "Spelling," Chap. 11, "Handwriting," pp. 231-254. New
York: Rinehart & Company, Inc., 1939.

Articles

Gray, William S.: "Reading," Encyclopedia of Educational Research
(Walter S. Monroe, ed.), pp. 891-926. Also "Reading: II. Physiology and
Psychology of Reading," rev. ed., pp. 972-1005. New York: The Macmillan
Company, 1941 and 1950.

. . . Journal, 30:399-405.

Lee, J. Murray, Willis W. Clark, and Doris May Lee: "Measuring Reading
Readiness," Elementary School Journal (1934) 34:656-666.

Stone, Clarence R.: "Validity of Tests in Beginning Reading," Elementary
School Journal (1943) 43:361-365.

. . . P. A., and . . . : "Preventing Reading Dis . . . Reading
Readiness F . . ."

Wrightstone, J. Wayne: "Diagnosing Reading Skills and Abilities in the
Elementary Schools," Educational Method (1937) 16:248-254.


II. SPELLING

Ayres, L. P.: Measurement of Ability in Spelling, Bulletin of the
Division of Education. New York: Russell Sage Foundation, 1915.

Breed, F. S.: How to Teach Spelling. Dansville, N.Y.: F. A. Owen
Publishing Company, 1930.

Broom, M. E.: Educational Measurements in the Elementary School,
"Spelling," pp. 95-98, 158-172, "Handwriting," pp. 147-158. New York:
McGraw-Hill Book Company, Inc., 1939.

Davis, G.: "Remedial Work in Spelling," Elementary School Journal
(1927) 27:615-626.

Eldridge, R. C.: Six Thousand Common English Words. Niagara Falls,
N.Y., 1911.

Foran, Thomas G.: The Psychology and Teaching of Spelling. Washington,
D.C.: The Catholic University of America Press, 1934.

Gates, A. I., and David H. Russell: Diagnostic and Remedial Spelling
Manual. New York: Bureau of Publications, Teachers College, Columbia
University, 1937.

Hildreth, Gertrude: Learning the Three R's. Minneapolis: Educational
Publishers, Inc., 1936.

Horn, Ernest: A Basic Writing Vocabulary: 10,000 Words Most Commonly
Used in Writing, Monographs in Education, First Series, No. 4. Iowa
City: University of Iowa, 1926.

Horn, Ernest: "Principles of Method in Teaching Spelling as Derived
from Scientific Investigation," Eighteenth Yearbook of the National
Society for the Study of Education. Bloomington, Ill.: Public School
Publishing Company, 1919.

Horn, Ernest: "Spelling," Encyclopedia of Educational Research. New
York: The Macmillan Company, 1941.

Jones, W. Franklin: Concrete Investigations of the Material of English
Spelling with Conclusions Bearing on the Teaching of Spelling.
Vermillion, S.D.: University of South Dakota.

Pyle, William H.: The Psychology of the Common Branches. Baltimore:
Warwick and York, Incorporated, 1939.

Thorndike, E. L.: The Teacher's Word Book, rev. ed. New York: Bureau of
Publications, Teachers College, Columbia University, 1931.

Tidyman, W. F.: Survey of the Writing Vocabularies of Public School
Children in Connecticut, Teachers' Leaflet. U.S. Bureau of Education,
1921.


III. HANDWRITING

Ayres, L. P.: A Scale for Measuring the Quality of Handwriting of
Adults, Russell Sage Foundation Pamphlets. New York: Russell Sage
Foundation, 1915.

Conard, Edith U.: "Manuscript Writing Standards," Teachers College
Record (1929) 30:669-80.

Freeman, F. N.: "Handwriting," Encyclopedia of Educational Research.
New York: The Macmillan Company, 1941.

Freeman, F. N., and M. L. Dougherty: How to Teach Handwriting. Boston:
Houghton Mifflin Company, 1923.

Gray, C. T.: A Score Card for the Measurement of Handwriting, Bulletin.
Austin: University of Texas.

Leamer, Emery W.: Directions for the Use of the Leamer Diagnostic
Practice Sentences in Handwriting. Bloomington, Ill.: Public School
Publishing Company, 1924.

Pressey, S. L., and L. C. Pressey: The Pressey Chart for Diagnosis of
Illegibilities in Handwriting. Bloomington, Ill.: Public School
Publishing Company, 1927.

Teachers Manual, Courtis Practice Tests in Handwriting. Yonkers, N.Y.:
World Book Company.

Thorndike, E. L.: "Teachers' Estimates of Specimens of Handwriting,"
Teachers College Record (1914) Vol. 15, No. 5.

West, Paul V.: "Remedial and Follow-up Work," Handwriting (Elements of
Diagnosis and Judgment of Handwriting), Bulletins No. 1 and No. 2.
Bloomington, Ill.: Public School Publishing Company, 1926.


CHAPTER 6 


Measurement of Language and Literature 


Smith, Dora V., "Diagnosis of Difficulties in English," Educational
Diagnosis, Thirty-fourth Yearbook of the National Society for the Study
of Education.


. . . would prevent the writing of incomplete sentences and would ensure
the agreement of subject and predicate, the correct use of pronouns,
. . .

To teach all pupils and students the elements of good taste in writing
and speaking through reading and communing with the literature suitable
for their age.

To teach all pupils and students how to collect materials on a topic,
how to organize them, and how to present them effectively.

The special objectives of oral speech are as follows:

To attain a special facility with oral forms of speech. If speech
habits are almost automatic, opportunity is given for speakers to think
while they are speaking. This refers especially to the correct use of
pronouns and verbs.

To attain the ability to pronounce the words used.

To enunciate the words so that they may be understood.

To learn how to emphasize particular words and phrases so that certain
ideas will stand out.

To learn to arrange the gathered material in an orderly manner so that
the thought flows smoothly and clearly.

The special objectives of written language are as follows:

To learn the mechanics of expression: punctuation, capitalization, and
spelling.

To develop an understanding of sentence structure and the relations of
words in a sentence.

To master the elements of grammar: agreement of subject and verb,
tenses of verbs, pronouns, distinction between words, etc.

To learn to become keenly sensitive to words and their usage.

To develop a desire to express well and exactly ideas entertained.

To attain proficiency in gathering materials, organizing them, and
presenting them in writing in an orderly convincing manner. This would
include the taking of notes, outlining, and giving attention to form of
presentation.

To develop an understanding of the selection, arrangement, . . . within
a paragraph so that unity . . .

To develop a taste . . . by studying the language of great writers.

To attain some proficiency in creative writing.


LANGUAGE TESTS: ELEMENTARY SCHOOL

ORAL LANGUAGE

There are no tests of oral language available at the present time. The
author was not even able to find a well-developed check list which
might be applied to oral speech. One not too successful attempt was the
use of recording equipment to arrange oral compositions in a scale, but
up to the present this procedure has not proved satisfactory.¹ In the
Cross English Test² there is one section which tests pronunciation. The
directions of this section say:

Place a check mark in the parentheses nearest each correct
pronunciation as in the samples. Give careful attention to the position
of the accent mark.


Four examples from this section are:

     7  recognize     rék' à niz ( )      rék' óg niz ( )
    11  salmon        sál' mün ( )        sám' ün ( )
    15  regular       rég' ü lär ( )      rég' ŭl ar ( )
    19  sagacious     sa ga' shüs ( )     sü gásh' üs ( )



There are 32 words to be pronounced. This test is one of eight in this
test and has no separate scores or norms. Its score is added with seven
others to make a total for which decile norms are available for the
grades through 13.


WRITTEN LANGUAGE

Tests for written language cover most of the more formal aspects of
language. Considerable attention to language tests has been given by
those who construct test batteries. A detailed description of the
language tests of three well-known test batteries will be given to show
what such batteries contain: (1) Metropolitan Achievement Tests,³
(2) Stanford Achievement Tests,³ and (3) Iowa Every-pupil Tests of
Basic Skills.

The Metropolitan Achievement Tests divides its tests of English into
(1) language usage, (2) punctuation and capitalization, and
(3) grammar. These language tests are in one sense diagnostic, for
before the construction of these tests careful studies had been made of
the most frequent and the most persistent errors that children make in
language. For the most part these tests, especially those which test
language usage, concentrate on checking these errors. It is assumed
that if children know these more difficult aspects of language they
will have little difficulty with the rest. In the Metropolitan
Achievement Tests, language usage tests . . .

¹ Netzer, Royal F., The Evaluation of a Technique for Measuring
Improvement in Oral Composition, doctoral dissertation, University of
Iowa, 1937.
² World Book Company, Yonkers, N.Y., 1923. Items by permission.
³ Items by permission of World Book Company, Yonkers, N.Y.



The form of the language-usage test may be illustrated with a few
samples:

    Complete the sentence.

    The baby ____ the milk from the bottle.
    This is the child w____ was late.
    Cats keep clean by washing th____selves with their tongues.
    Neither of the two boys w____ willing to bring it.
    W____ do you think will win the prize?
    "Did he lie down?" "Yes, he l____ down on the bed and fell asleep."

There are 38 such items at the elementary level and 46 at each of the
upper levels. Here are some of the language usages sampled: "give" and
"gave," "drank" and "drunk," "may" and "can," "taken" and "took,"
"sit," "sat," and "set," "lie" and "lay"; past tenses and past
participles of such words as "choose," "begin," "grow," "stay," "eat,"
and "do"; pronouns; "neither . . . nor"; "those kinds"; and many more.
Such a large sample of language usage offers many opportunities for the
study of the types of difficulty present.

Beginning with grades 4, 5, and 6 there is also a section on
punctuation and capitalization. A sample sentence to be punctuated is:

    Take it away it is annoying me.

A sample of a sentence in which to enter capital letters and
punctuation marks is:

    mrs Green is carols aunt.

In the advanced battery (for grades 7 and 8) the more formal aspects of
language contain tests of grammar as well as tests of language usage,
punctuation, and capitalization. In the test of grammar, tests are
arranged for types of sentences, for the number of words in the subject
and predicate, and for recognizing several . . . In a short paragraph,
also, children are asked to designate simple, . . . and compound
sentences. Some attention is also paid to the selection of the one
principle among nine which applies to the following: "neither . . .
nor," "doesn't," etc. From this description it is evident that the more
formal aspects of punctuation, . . . are covered. It is also quite
evident that in using this test for teaching purposes analysis of
errors could be made and points of weakness made evident.

The Stanford Achievement Tests have also provided adequate . . . of
language. For example, in the intermediate battery there are . . .
items of language usage. Difficulties studied are "ain't got" and
"have," "doesn't" and "don't," "did" and "done," "went" and "gone,"
"eaten" and "ate," "broke" and "broken," "come" and "came," etc. This
test offers a very adequate coverage of the usual errors in language
usage. In like manner the advanced battery has 100 items of paired
words in a sentence, one of which is correct.

    I wanted to (lay, lie) down and sleep.
    Was it (her? she?)
    How do you know it was (they? them?)

There is no attempt in this set . . . tion, recognition of the . . .

In the Iowa Every-pupil Tests of Basic Skills occurs Test C, a test of
punctuation, capitalization, usage, spelling, and sentence sense. In
the test of punctuation are included sentences with no punctuation
where periods, commas, question marks, quotation marks, and apostrophes
are needed. There is a separate section in which the sentences are
punctuated but no capitals are used. In the section testing language
usage, correct and incorrect forms are paired:

    There weren't (any, no) more nuts.
    He (chose, choosed) the red one.

The test of sentence sense asks the student to place a cross in the R
box if it is a good sentence; in the W box if it is not a good
sentence.

    R  W   Then as the boys came back to their seats
    R  W   You are lucky


. . . are the California Achievement Tests and Coordinated Scales
of Attainment. The former of these offers special
E ther well-known matters 
| ion to these га c s 
The a has called attention ае 
ч mise anii баш because 50 m d be er п 
administrative Йи Шоп. The limitation с es tem di 
tna ноши е е the results of tests. Test 
tate EIUS tome M Pe d ther type. In its results 
uer ae more ur pardus T V sse lt If we 
| : e i me 
discover the weak points and arrange materials to overcome these
errors, learning will be greatly facilitated.


ing 1 
Tests devoted entirely to assaying language . . . test battery. By
administering a test battery such as those described in this volume it
may become apparent that m . . . needed. It may then be decided
t . . .

The Iowa Language Abilities Test is available at two levels: (1) the
elementary test, for . . . , and (2) the intermediate test, for . . .
There are three forms of each test (A, B, and C) to be scored by hand,
and three (Am, Bm, and Cm) to be scored by machines.

chines, 


The elementary test has five subtests: (1) spelling, (2) word meaning,
(3) language usage, (4) capitalization, and (5) punctuation. There are
. . . complete items in each subtest and 25 additional items on
language usage.

The spelling test is a recognition test, a form of proofreading
technique, in which the pupil checks the correct spelling from four
spellings, only one of which is correct.

    10. (1) allready    (2) alreddy    (3) already     (4) alreaddy      10   1 2 3 4
    34. (1) seperately  (2) seprately  (3) separatly   (4) separately    34   1 2 3 4

The words selected are those which seem to present persistent spelling
difficulties found among words of high social frequency and
importance.¹

The word-meaning subtest (Elementary Test, Form A) offers five choices
for selecting the word which (1) means the same as, and (2) means the
opposite of the word to be defined.

    14. Fresh         1 new      2 frozen   3 clear   4 stale      5 cold          14   1 2 3 4 5
                                                                                        1 2 3 4 5
    37. Counterfeit   1 genuine  2 new      3 false   4 peculiar   5 contra . . .  37   1 2 3 4 5
                                                                                        1 2 3 4 5

¹ Manual for Interpreting, p. 4. Items from this test by permission of
World Book Company, Yonkers, N.Y.

The language usage subtest consists of two parts: (1) correct word
forms, and (2) faulty expressions. Correct word forms are tested in the
usual way:

    12. The hat cost (1) two (2) to dollars               12   1 2
    38. The cook (1) ringed (2) rang the dinner bell      38   1 2

Faulty expressions are tested by 25 sentences which are judged "good"
or "bad." In the sentences appear quite a variety of sentence errors.

    60. The boys they went fishing                        60   Good  Bad
    73. We can easy take two or three in our car          73   Good  Bad

The capitalization subtest may be illustrated by the following:

     7. I hope miss Kelley will come with you                   7   1 2 3 4 N
    23. She felt better, just as dr. Brown said she would      23   1 2 3 4 N

The punctuation subtest checks the correct usage of the period, comma,
question mark, apostrophe, and quotation marks.

    14. The members of the band in the meantime, took . . .        14
    27. Dear Edith
        I hope you will come for a visit next week                 27

The Intermediate test of the Iowa Language Abilities Test is
constructed much like the elementary test. Examples from the
word-meaning subtest are:

    11. To have power of endurance
        1 queer   2 sensible   . . .                                  11   1 2 3 4 5
    20. To regard with strong approval
        1 admire   2 care   3 delight   4 dislike   5 advance         20   1 2 3 4 5
    46. To suffer pain, sorrow, or destructive force patiently
        1 exempt   2 yield   3 annex   4 endure   5 transport         46   1 2 3 4 5


The subtest on grammatical form recognition lists eight forms, namely
(1) noun, (2) pronoun, (3) adjective, (4) adverb, (5) verb,
(6) conjunction, (7) preposition, and (8) interjection, which students
are to recognize in 25 sentences. The additional subtest in this
intermediate test consists of 50 sentences, some of which are complete;
others, fragmentary. The test is to mark them "S" if complete; "F" if
fragmentary. Examples are: "Birds flying," "The coat hanging in the
corner."

The details of this test have been described to show how completely the
field is covered. Norms of the test are in grade equivalents and
percentile ranks for grades 4.8, 5.8, 6.8, 7.8, 8.8, 9.8, and 10.8. The
care exercised in construction, the number and variety of exercises,
the extent of coverage of important details, and the time to administer
(48 and 46 minutes) recommend this test for use in the elementary
school.

Other tests of language usage, capitalization, and punctuation such as
the Kirby Grammar Test, Wilson Language Error Tests, and Briggs English
Form Test have been superseded by the newer tests which have been
described in this text.

DIAGNOSTIC TESTS

In spite of the most skillful instruction, errors will creep into the
language. In the upper grades especially these errors become noticeable
to the teacher in the pupil's speech and in his written work. Some of
these errors may be detected in the general battery or even better in
the special battery for language abilities. At times, however, there is
need for concentrated analysis and more complete diagnosis. To meet
such a need is the purpose of the diagnostic test of language errors.

Satisfactory diagnosis in all areas is impossible, nor has it been
attempted. In the area of language usage within the sentence, the
. . . Diagnostic Tests in Language¹ offers opportunities for diagnosis
in three areas: (1) pronouns, (2) verbs, and (3) varied constructions.
There are two forms. In Part I, 120 different examples of uses of
pronouns occur. The uses of objective or nominative forms in compounds
and as complements of copulative verbs, of objects of prepositions, of
subjects of infinitives, of object of a transitive verb are tested.
Also errors connected with "us" and "we," with demonstratives, and with
singular or plural forms in indefinites are included.

Part II contains 149 opportunities for using correct or incorrect verb
forms arranged in groups. Special attention is given to the use of the
past tense or past participle, to the errors arising in the use of
present or past tense, to the use of the wrong verbs, and to verb
corruptions, etc.

The third section, varied constructions, emphasizes the agreement of
person and number of verbs, errors in the use of adverbs, conjunctions,
. . . nouns, adjectives, and ends with 40 items on sentence errors.
Scores are in terms of errors. No norms or reliabilities are given.

Two other tests which use the term "diagnostic," perhaps ill advisedly,
are the Leonard Diagnostic Test in Punctuation and Capitalization and
the Los Angeles Diagnostic Test in Language. Both are too narrowly
conceived and too inadequate in sampling to be truly designated as
"diagnostic."

¹ C. A. Gregory Co.



LITERATURE

The teaching of literature in the elementary school has as its
objective the gaining of acquaintance with writing of good quality in
which ideas are fittingly and sometimes beautifully expressed. It
. . . examples with which . . . the identification of a character or
quotation with the piece of literature in which it occurred, and the
matching of the characters with events in a poem, story, or novel. In
general, there has been no attempt to measure powers of discrimination
which weigh the characteristics and find this poem or story better than
that one. The attempts at measuring have been confined mainly to the
general test batteries. The Stanford Achievement Test, Metropolitan
Achievement Tests, and the Coordinated Scales of Attainment have
well-developed sections on literature. Of these, the Coordinated Scales
of Attainment places more emphasis on the facts contained in the story
itself than on the author or on the conditions under which it was
written. The following items are from grade 6, Coordinated Scales of
Attainment:¹

    34. Who wandered about doing good, dressed in an old coffee sack,
        with a stew pan cocked over one ear?
        1 Johnny Appleseed   2 Ichabod   3 Pecos Bill   4 Mike Fink
        5 Gareth.

    In Eugene Field's poem, "The Duel,"
        1 The characters ate each other up   2 They fought their duel
        in a garden   3 They sailed away in a beautiful boat   4 They
        were stolen away by kidnappers   5 They made up their quarrel
        and became good friends again.

This is an item from grade 7:

    47. A poem about a soldier who was hanged for shooting a comrade as
        he slept is
        1 "The Ballad of John Silver"   2 "The Highwayman"   3 "The
        Skeleton in Armor"   4 "Danny Deever"   5 "Jim Bludso."

This series of tests emphasizes breadth of reading more than intensity
of study. For example, in grade 7 the test asks what is the story about
the Baxter family in Florida, which of five stories is about the life
of a musician, and what story (of five listed) pictures life in a
family living in Maine.

The section on literature of the Metropolitan Achievement Tests (grades
7 and 8) includes 80 questions about stories and poems. Pupils are
asked to finish two well-known quotations, to identify quotations with
the poem or story in which they occur, to match characters, to identify
leading events with the story, to recognize the country in which the
events of the story took place, and to answer many other questions
about the contents and characters of stories and poems. Two or three
illustrations will make this clear. In the form usually of multiple
choices, children are asked what happened to both arrow and song, for
what was the Bell of Atri rung, who was the lawgiver of . . . , in
which of four stories was the blue-haired fairy a character, and who
saved Rome. Examples of items are:

¹ Items by permission of Educational Test Bureau, Minneapolis, Minn.


     2. Doing nothing is doing ____
        1 too much   2 plenty   3 good   4 ill
     7. At the "Mad Tea Party" the one who was sleeping was the ____
        1 Queen   2 Hatter   3 March Hare   4 Dormouse
    17. "The Barefoot Boy" is a poem about
        1 an orphan lad   2 a city boy   3 a lazy boy   4 a country boy
    38. "The King of the Golden River" tells how
        1 a cruel king . . .   2 a boy discovered gold in a river
        3 a valley became beautiful as a garden   4 Gluck killed a giant



cone 

An interesting exercise occurs in Items 56 to 63, which together
constitute a matching test. Ten selections are listed, after which come
eight quotations each of which must be identified with the correct
selection. For example, "Between the dark and the daylight when the
night is beginning to lower," "The breaking waves dashed high on a
stern and rock-bound coast," and six other quotations must be matched
with the proper selections.

The literature section of the Stanford Achievement Test, advanced
battery, contains 50 items almost entirely of the informational
variety. Only three short choices are given for each item. When turning
to this test from those just considered there is a distinct feeling
that this test is more dependent upon memorized details than the
others. Three examples are illustrative of this test:

    11. Buffalo Bill's name was ____   4 Kenton   5 Crockett   6 Cody          11   4 5 6
    21. The Selfish Giant shut the children out of his
        7 house   8 library   9 garden                                         21   7 8 9
    38. Evangeline lived in   4 Acadia   5 Tuscany   6 Normandy                38   4 5 6
PR U 

A consideration of the tests of literature present in the three test
batteries makes it very clear that there is no test of the capacity to
distinguish between what is good and what is poor in literature. Nor
are there special tests of the organization and development . . . No
. . . essential appreciation appears at any point, nor does any
evaluation of one's own creative product appear in any test. On the
other hand, considerable improvement has been achieved over the older
tests by introducing questions concerning the contents of selections
rather than by selecting questions, as was once the custom, from the
names of authors and their works. Further tests of


LIST OF LANGUAGE AND LITERATURE TESTS: ELEMENTARY SCHOOL

Name of test | Grades | Kind of test | Forms | Types of scores | Publisher
Briggs English Form Test | 7-9 | Survey | . . . | Grade scores in terms of errors | Teachers College, Columbia University
. . . Diagnostic Tests in Language | . . . | Diagnostic | 2 | Tests scored in terms of errors; norms regarded as unimportant for diagnosis | C. A. Gregory Co.
Hudelson English Composition Scales | . . . | Survey | 1 | Grade norms | World Book Company
Iowa Language Abilities Test | . . . | Survey | Am, Bm, Cm | Standard scores, grade equivalents, percentiles | World Book Company
Van Wagenen-Dvorak Diagnostic Examination of Silent Reading Abilities | 6-9, 10-12 | Analytical | 3 | Raw scores converted to T scores | Educational Test Bureau
Leonard Diagnostic Test in Punctuation and Capitalization | 5-12 | Diagnostic | . . . | Grade norms and percentiles | World Book Company
Los Angeles Diagnostic Test in Language | . . . | Diagnostic | . . . | Grade and age norms | California Test Bureau
Cross English Test (high school use here) | 9-12 | Survey | 3 | Grade norms | World Book Company
Pressey Diagnostic Tests in English Composition | 7-12 | Diagnostic | . . . | Grade norms | Public School Publishing Company
Nassau County Scale (Trabue), composition | . . . | Survey | 1 | Grade norms | Teachers College, Columbia University
Van Wagenen English Composition Scales | . . . | Survey | 1 | Grade norms | World Book Company
Coordinated Scales of Attainment in Literature | 1-8, 9-12 | Survey | 2 levels | Grade norms | Educational Test Bureau



literature are described when this topic is treated at the high school
level.

LANGUAGE TESTS: SECONDARY SCHOOLS 
OBJECTIVES AT THE SECONDARY SCHOOL LEVEL 


The objectives in teaching ... of this chapter are continued ... within the written composition. ... all the desirable outcomes of English ... with the aims of total education. ... includes such topics as ... discriminate between ... increasingly adequate ... to develop ... matters within their ... objectivity of some of ... appreciation of the poems ... or the dramas of ... And what English teacher ... be satisfied unless ... added joy when ... Philosophies differ so widely about ... English teaching should be ...




... philosophy worships at the shrine of interest. He studies children's
interests, tries to understand them and to lead them into new and more
realistic areas. Reading lists are made up of those materials that are
in demand. Within limits, students make their own reading lists. In
the light of such divergent philosophies there is little wonder that
materials of instruction vary broadly from course to course. And yet
there are some areas where measurement is satisfactory.

When the whole subject of English is considered it is seen that it
may very easily be divided into two large areas. One of these areas has
to do with the student's getting acquainted with great literature. To do
this, he must learn how to read literary masterpieces with facility and
... somewhat from ordinary ... material and its manner of expression.
To read literary material with facility and understanding the
reader must be highly sensitive to figures of speech, to allegory and
symbolism, to classical, mythological, and historical references, etc.
Poetry, too, adds other difficulties, for its understanding and appreciation
is increased by understanding something of rhyme and rhythm, of
scansion, of the sonnet, of blank verse, and of poetic license.
... have been mastered the full understanding ... experiences vicariously,
gains for one ... appreciation and judgment of the best of what ...
the student to an awareness of our literary heritage: its masterpieces,
... its historicity.

The second ... of English ... to express one's ideas clearly and
effectually both in ... Primarily, we think of grammar on the one hand
and ... composition on the other. Grammar usually involves a study of
sentence structure, spelling, punctuation, and capitalization. At its best,
grammar is ...; its facts are learned in connection with sentence ...
Composition and rhetoric ... with the expression of thought effectively
in larger units, the paragraph and the essay or poem. It is thus largely
a matter of organization. ... language and public speaking involve the
audience situation. The ... used and the manner of expression differ
somewhat from written language. Unfortunately, no satisfactory measures
have been developed in the area of speech. Measurements have been
constructed in the following areas:


1. Language structure and usage
   a. Language usage
   b. Capitalization and punctuation
   c. Spelling
   d. Sentence structure
   e. Organization
   f. English composition
2. Literature and appreciation
   a. Reading comprehension and understanding
   b. Vocabulary
   c. Literary appreciation, judgment, and acquaintance


LANGUAGE STRUCTURE AND USAGE 


Among these categories the tests ... are probably the most satisfactory.
The errors in ... in the correct forms of pronouns, and in ... of a few
difficult words, ...

Tests of English Usage 


The grammatical usage division of the Cooperative English Test, Mechanics
of Expression, contains 60 sentences ... phrases in each sentence
underlined and numbered. The problem is to recognize the error ... of the
line. Some examples follow:

25. He was in despair; to who could he turn?   (   )
    1  2  3  4
33. The puppy made itself at home and calmly laid down near the fire.   (   )
    1  2  3  4
37. Accuracy of movement, like accuracy of words, are essential to the ... of magical rites.   (   )
    1  2  3  4
Differences between "ready" and "already," between "principal" and
"principle," are tested. Errors to be corrected in the test involve the
use of pronouns as objects of prepositions and verbs as well as their
reference to antecedents.
In the Barrett-Ryan-Schrammel English Test, two of the three parts
are concerned with language usage. In Part I, Sentence Structure and
Diction, correct and incorrect words or phrases in a running ...
are underlined and the subject must recognize whether the ...
correct or not. Some of the errors present are the use of "of" for
"have," lack of parallel construction, ... for "most." In the second part
of the test, Part II, Grammatical ..., the subject must both recognize the
error and ... its grammatical rule. For example, in "It was left to ..."
the subject must recognize both that "I" is an error and also that it is
the object of "to." The subject might even consider ... wrong and give
the reason for it, yet be mistaken on both counts. Agreement of verb with
subject when phrases come between, proper ... pronouns, agreement of verb
with what comes after "there," ... the correct usage of the subjunctive,
and the proper use of ... are samples of what the test contains. The
reliability is indicated by a coefficient of .88 and .89¹ when computed
from comparable forms and .91 to .94 when the odds-even technique is used.
Furthermore, this test correlates about .74 with final English marks
in the first semester of college.
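
The odds-even technique mentioned above may be illustrated by a brief computation. What follows is only a sketch under stated assumptions: the item marks are invented for illustration, the function names are hypothetical, and the step-up by the Spearman-Brown formula is the usual practice for this technique rather than a procedure reported in the test manual.

```python
from statistics import mean

def pearson_r(x, y):
    """Product-moment correlation between two equal-length score lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def odd_even_reliability(item_scores):
    """item_scores: one list of 0/1 item marks per pupil (hypothetical data)."""
    odd = [sum(row[0::2]) for row in item_scores]   # items 1, 3, 5, ...
    even = [sum(row[1::2]) for row in item_scores]  # items 2, 4, 6, ...
    r_half = pearson_r(odd, even)
    # Spearman-Brown step-up from the half-length to the full-length test
    return 2 * r_half / (1 + r_half)

# Invented marks for five pupils on a six-item test
pupils = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 0, 1, 0],
    [1, 0, 1, 1, 0, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
]
reliability = odd_even_reliability(pupils)
```

Because the two halves are given at one sitting, coefficients obtained in this way tend to run somewhat higher than those from comparable forms, which agrees with the figures quoted.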


Capitalization and Punctuation 

Tests of capitalization usually consist of a sentence or paragraph
which requires the subject either to indicate the correctness or
incorrectness of the usage set forth or actually to correct the error. In some
cases the tests are so constructed that the manuals for the student
and for the teacher contain a discussion of the grammar involved.
This procedure illustrates beautifully our contention that the major
uses of tests inhere in their capacity for diagnosis and analysis of
difficulties, followed by the application of the laws of learning at the
point of weakness.

Pressey's Diagnostic Tests in English Composition include tests of
capitalization and punctuation as well as of grammar and sentence
structure. The test of capitalization consists of 28 sentences from
which all capitalization has been excluded save that of the first word in
the sentence. The subject must write in capitals where they are needed.
Since the sample of the usage of capitals was arrived at by means of a
study ... with which capitals are used in periodicals, ... letters, the
discovery of errors in their usage ... Illustrations are:

The Rhine flows from the alps to the baltic sea.
Many children's favorite holidays are christmas and thanksgiving.

After the scoring is done the errors of the children are analyzed. The
manual describes the principle of capitalization which has been illustrated
by the sentence. For example, in the second of the sentences above,
the principle of capitalizing the days of the week, the months of the
year, and holidays and church days is illustrated.

This careful analysis of specific errors contrasts rather sharply with
the ... on capitalization in the Cooperative English Test, Mechanics
of Expression ... in the Pressey test, ...

1 Manual of Directions, pp. 1-2. Yonkers, N.Y.: World Book Company.


Punctuation, too, varies in the tests from a mere recognition of its
being correct or incorrect to ... the running story whether ... properly
belongs at that ... the omission need not be properly corrected ...
The most common uses of the comma, semicolon, ... question marks, and
periods are ... satisfactory and is expressed in terms of standard ...


The ... of spelling in high school are the same as ... the elementary
school (page 121). The problem is to secure facility in spelling those
words ... frequently used in written communication. It is also important ...
The more automatic the spelling procedure, the more attention can be
devoted to the thought. Under such conditions the recognition of a
misspelled word when the written material is checked looms very large.
It would seem then that presenting a misspelled word among others
correctly spelled has at least the justification of its use in proofreading.

One of the first serious attempts to construct a spelling test suitable
for high schools was Sixteen Spelling Scales.¹ The authors of this scale
secured 2,000 most frequently used words from previous studies and
from their own experimentation and embodied samples of these into
16 spelling scales of 20 words each. The 2,000 words were submitted
in lists of 100 to 46,017 pupils in 181 high schools to be spelled. Each
list of 100 was spelled by 160 to 1,200 secondary school pupils. From
these data was assembled a list of 2,000 words whose difficulty had been
thus determined. From this list 12 lists of 20 words each were
arranged in such a manner that the first words in all 12 lists were of
equal difficulty, as were the second, the third, etc. Lists XIII through
XVI were somewhat more difficult. Each of the 20 words of every test
was first pronounced; then the sentence was read to the pupil and the
word to be spelled pronounced a second time. Norms (medians) were
published for each grade and provision made for the teacher to make
his own test by selecting words whose difficulty was known. The
strength of such a test depends upon the care with which the words
are selected. It is noted that while the word to be spelled was embedded
in a sentence it had attention called to it by pronouncing it the second
time. The reliability was satisfactory for individual diagnosis provided
as many as 10 were used.
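
The arrangement of words into lists of matched difficulty may be sketched as follows. This is an illustration only and not the authors' actual procedure: the words, the per-cent-correct figures, and the function name are all hypothetical. Words are ranked by the proportion of pupils spelling them correctly and then dealt out in bands, so that the first word of every list comes from the same band, the second word from the next band, and so on.

```python
def build_parallel_lists(difficulty, n_lists):
    """Deal words into n_lists lists so that words in the same position
    across lists are of nearly equal difficulty.

    difficulty: dict mapping word -> per cent of pupils spelling it correctly.
    """
    ranked = sorted(difficulty, key=difficulty.get, reverse=True)  # easiest first
    usable = len(ranked) - len(ranked) % n_lists  # drop any leftover words
    lists = [[] for _ in range(n_lists)]
    for start in range(0, usable, n_lists):
        band = ranked[start:start + n_lists]  # words of similar difficulty
        for lst, word in zip(lists, band):
            lst.append(word)
    return lists

# Invented difficulty values for illustration
per_cent_correct = {
    "receive": 62, "separate": 60, "calendar": 61,
    "parliament": 30, "embarrassing": 28, "accommodate": 29,
}
lists = build_parallel_lists(per_cent_correct, n_lists=3)
```

In this simple form the first list always receives the easiest word of each band; rotating the starting list from band to band would be one way of evening out that small bias.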
Another ... prepared for high school students is the
Bixler High School Spelling Test, revised edition, for grades 7 to 12.²
This is a booklet which contains 64 40-word lists which may be
used for ... It contains words from the 5,000 most
commonly used words as determined by the Commonwealth investigation.
Every word in the test, therefore, is a common word. From this
list four scales of 100 words each have been prepared for use in
high school. ... a larger list the problems of the ... their difficulty
arise. The number of ... to be used depends on the ... testing. If the
problem is that of distinguishing between two grades, then a list of 25 will

1 Hudelson, Earl, F. L. Stetson, and Ella Woodyard (under the direction of
T. H. Briggs and T. L. Kelly), Sixteen Spelling Scales. New York: Bureau of
Publications, Teachers College, Columbia University.
2 ..., Harold E., and Ernest ..., Bixler High School Spelling Test.
Atlanta, Ga.: Turner E. Smith Co., 1940.


be adequate. If, however, the question is to measure the spelling
capacity of a single child, at least 100 words will be necessary.
Finally, attention may be called to Part III, Spelling, of the Cooperative
English Test, Mechanics of Expression. In this spelling test the
words were selected from the work of Horn and Ashbaugh. Each
appears misspelled with three other words that are correctly spelled.
For example, Item 25 has¹


1. sanctioned 
2. receipted 
3. registrar 
4. parliment 
5. none wrong 


while Item 26 has¹


1. treatise
2. accessible
3. vengeance
4. embarassing
5. none wrong


Sentence Structure 
One of the most important ... ability to write satisfactory sentences.
The best indication of ... achievement appears in the composition. Next
to turning out a ... sentence is the recognition of ...

1 Items by permission of Educational Testing Service, Princeton, N.J.

... had the most to do with his choice. These choices are furnished ...
and a higher level. The following example is ... higher ...


Of the four sentences below, which one is most effectively expressed?

1. As the chief was away from home, we were welcomed by his deputy, a ruddy
   young man with an infectious grin.
2. The chief's deputy was a ruddy young man with an infectious grin who
   welcomed us because he was away from home.
3. The chief's deputy welcomed us, a ruddy young man with an infectious grin
   ... away from home.
4. The chief was away from home and his deputy welcomed us and he was a ruddy
   young man with an infectious grin.

Which one of the following considerations had the most to do with your choice of
the best sentence in the group above?

1. An adjective clause may be clearer than an appositive.
2. If it is not made clear what a pronoun refers to, the sentence may be ambiguous.
3. Successive clauses connected by "and" may be used when it is desired to give
   equal weight to various thought elements.
4. Each verb should have a subject.
5. A ... is generally taken to modify the subject of a sentence.
There are five items in which judgments are made between
... sentences as to which one is better expressed. There are 10 items
resembling ... the illustration just presented. In all cases of choice
students must check the consideration which led them to their particular
choice. If both levels of this test are used many of the most useful
principles of sentence structure are tested. The reliability of the total
Cooperative English Test is above .95. In fact, the reliability of each
... parts is in the neighborhood of .95.


Organization 


... and clarity of expression ... composition classes. ... measure this
outcome ... directly in one of the ... of the Cooperative English Test,
Effectiveness of Expression ... for the lower level; the other,
for the higher level. Three types of ... the testing of organization.
... sentences, one of which does not belong ... second type a sentence
has been separated ... parts ... must be rearranged in the correct order.
... example of ... second type is from the lower level:¹

1 Items by permission of Educational Testing Service, Princeton, N.J.


A. The children grow awkward and ruddy
B. because this is still London
C. they rush whooping along the cinder tracks
D. not country children
E. between the ashbins and straggling flowers
F. but not sharp-eyed, pallid Londoners either,


I. ( 24 )
   A. Cleaning the Turkey
      1. ( 25 )
      2. Removing Pinfeathers
   B. ( 26 )
   C. The Roasting Process
      1. ( 27 )
      2. Length of Roasting Time

In filling in the incomplete outline above, which one of the following topics
would you use for (24) the main heading, I?

1. Stuffing
2. Preparation for Roasting
3. Degree of Heat to Use
4. Size of Turkey
5. Rinsing Inside of Turkey


In like manner each of the other blanks (25, 26, 27) has five topics from
which to select. The process of organizing materials into an ... is thus
measured by setting subjects to recognize that a
sentence does not belong with four others which are grouped about
the one idea, that there is a best sequence in separated parts of a
sentence, and that a topic has certain recognizable internal relations
sometimes called coherence.


1 By permission of Educational Testing Service, Princeton, N.J.

English Composition

The need of more precision in evaluating English composition has
been felt for a long time. It was thought that possibly a rating scale
might achieve at least some of the precision desired. After Thorndike
had demonstrated in 1909 that the Cattell-Fullerton theorem of equally
often noticed differences could be applied to general merit in
handwriting, Milo B. Hillegas applied the same principle to constructing
a Scale for the Measurement of Quality in English Composition
by Young People which was published in 1912. This scale was
composed of samples of compositions varying by known units from
very poor to very good. The known units were about one probable
error, a fact derived from the consideration that 75 per cent of
... judges chose one sample as being better than another.
The user of the scale simply slid a child's composition along the
scale until its general merit equaled that of a sample on the scale; its
value, then, was that of the sample. This scale has been improved in
... ways. In the first place, the compositions which composed
the scale were on different topics, which made comparison difficult.
Trabue corrected this weakness by building the Nassau County Scale
on the same general principle but requiring that all the samples must
be written on the topic "What I Should Like to Do Next Saturday."
It was soon recognized that children wrote better compositions when
they wrote of familiar experiences or on topics about which they were
well informed. Lack of information on a topic, therefore, produced
poorer quality in compositions. Hudelson's composition scale attempted
to overcome this difficulty by furnishing the data from which
the composition was to be constructed. The children simply had to
retell Aldrich's "A Snowball Fight on Slatter's Hill" after it had been
read to them. Some experimenters thought also that if this "general
merit" were broken down into smaller parts and those parts combined
more precise measurement could take place. As a consequence,
Van Wagenen constructed ... composed of scales for (1) ex... Each
composition was to be rated three times: (1) once on thought content (t);
(2) once on structure (s); and (3) once on mechanics (m). The
combination was made by using this formula:

$$\mathrm{GM}\ (\text{general merit}) = \frac{4t + 2s + 1m}{7}$$

This procedure looked efficient but did not work out so well in practice
because the errors of rating were additive. At any rate, the
reliabilities were no higher than when only general merit was rated.
Lewis narrowed the ... a scale made up of letters used in daily life:
mail orders, letters of application, social letters, etc. It emphasized
the same principle of construction as the Hillegas scale.
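
Two of the computations described in this section may be sketched briefly. The first combines three separate ratings into general merit with the weights 4, 2, and 1 given in the formula above. The second converts the proportion of judges preferring one sample to another into probable-error units, on the assumption that judgments are normally distributed and that one probable error equals 0.6745 standard deviations. The rating values and function names are hypothetical and serve only for illustration.

```python
from statistics import NormalDist

PROBABLE_ERROR = 0.6745  # one probable error, in standard-deviation units

def general_merit(t, s, m):
    """Weighted combination of thought (t), structure (s), mechanics (m)."""
    return (4 * t + 2 * s + 1 * m) / 7

def scale_distance_in_pe(proportion_preferring):
    """Distance between two samples, in probable errors, given the share
    of judges who rated the first sample better than the second."""
    z = NormalDist().inv_cdf(proportion_preferring)
    return z / PROBABLE_ERROR

gm = general_merit(t=70, s=60, m=50)   # invented ratings
step = scale_distance_in_pe(0.75)      # 75 per cent agreement
```

With 75 per cent of judges agreeing, the computed distance comes out at very nearly one probable error, which is the unit on which the Hillegas scale was built.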

Proper usage of the scales demands that the student go through a
rather rigid training in their use. After such practice more reliable
results can be obtained in judging compositions. Generally speaking,



... grade 5 and 4.2 for grade 6.

Here is the composition which most nearly represents ...nth-grade
achievement.¹ It is a trifle better than the average for grade ...

... each man was well supplied ... sparing with the ammunition ...
within range of ...

One of the strong points of the Hudelson scale is the ...
judged exercises which the user can practice on. One can ...
of these samples and compare his rating with the established

1 By permission of Public School Publishing Company, Bloomington, Ill.


scale. Constant errors of overrating or underrating can then be ...
ability scale does offer levels of attainment usually ...
by the averages of the various grades. It thus furnishes an
... toward which pupils may strive. When the compositions
... look absolutely hopeless to the college-trained teacher of English he
can look at the sample for his grade and be comforted. This goal,
since it is within reach, may become a stronger motivating influence
than one which approaches perfection.
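
The detection of a constant error of rating by means of practice exercises may be sketched in this way. The figures are invented and the function name is hypothetical: a rater's values for several practice compositions are compared with the established scale values, and the average signed difference is taken as the constant error.

```python
from statistics import mean

def constant_error(my_ratings, established):
    """Mean signed difference between a rater's values and the established
    values; positive means overrating, negative means underrating."""
    return mean(mine - std for mine, std in zip(my_ratings, established))

established = [2.0, 3.5, 4.2, 5.0, 6.3]   # invented scale values
mine = [2.5, 4.0, 4.6, 5.6, 6.8]          # invented practice ratings

bias = constant_error(mine, established)
corrected = [round(r - bias, 1) for r in mine]
```

A rater who finds himself about half a scale step too generous on the practice set can subtract that amount from later ratings or, better, continue practicing until the difference disappears.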


READING COMPREHENSION AND UNDERSTANDING 


A ... factor which is related to English as well as to other subjects
is reading. For many years instruction in reading was left almost entirely
to the elementary school. But when analyses of failures in both high
... it was discovered that poor reading was ...
with unknown words or because they had never been able to put
... of sentences together in their minds into a meaningful whole.
For these and other reasons there is today in most good high schools a
... objective aimed at improving comprehension in reading.


Many tests have been constructed to test reading for understanding.
Some of them have been content to ask questions which might be
answered by copying the correct answer verbatim from the paragraph.
Others, and these are the best, have included many questions which
can be answered from an understanding of the paragraph as a whole.
Difficulty has been controlled by increasing the subtlety of the questions
and by increasing the complexity and vocabulary of the paragraph.


One of these tests with forms suitable for both the junior high school
(lower level) and the senior high school (upper level) is the Cooperative
English Test, reading comprehension. The first part of this test consists
of 60 words to be defined, but the second, which requires 25
... for the junior ... constitute the material to be read. Four or five
questions are asked about ... paragraph. An illustration makes clear
the technique:

September 3rd (Lord's Day). Up; and put on my colored suit very fine, and my
new periwig, bought a good while since, but durst not wear; because the plague was
in Westminster when I bought it; and it is a wonder what will be the fashion after
the plague is done, as to periwigs, for nobody will dare to buy any hair, for fear of
the infection, that it had been cut off the heads of people dead of the plague. To
church, where a sorry dull parson, and so home and most excellent company with
Mr. Hill and discourse on music.

1 By permission of Educational Testing Service, Princeton, N.J.


82. This passage is apparently taken from
    1. an essay
    2. a diary
    3. a novel
    4. a short story
    5. a sketch   82( )

83. The writer had been afraid ...
    ... plague-stricken area   83( )

84. The writer may best be ...
    1. tenderhearted person
    2. timid person
    3. music lover
    4. practical person
    5. scoffer at religion   84( )

85. The tone of this passage is
    1. ironical
    2. persuasive
    3. solemn
    4. emotional
    5. matter of fact


... license, references to Gr... much else is involved ...
reading literature has certain peculiarities of its own. For this reason
we have such tests as the ... which uses as its reading material
selections from prose and poetry of high literary value. One of the
questions usually asked about the selections is the mood conveyed. ...
lines to about a half a page in length, form the materials for reading for
understanding. Here is a short selection with its questions:¹


The sky is low, the clouds are mean,
A travelling flake of snow
Across a barn or through a rut
Debates if it will go.

A narrow wind complains all day
How someone treated him;
Nature, like us, is sometimes caught
Without her diadem.


The central thought of the poem is that
10-1 Nature and people have more than one aspect
10-2 Winter is depressing
10-3 Winter comes upon us suddenly
10-4 The wind is very tiresome   10( )

The day described is
11-1 invigorating
11-2 depressing
11-3 frightening
11-4 soothing   11( )

In sound the wind is
12-1 howling
12-2 hustling
12-3 murmuring
12-4 whining   12( )

The last two lines suggest that
13-1 nature does not always seem sublime
13-2 nature is sometimes caught unawares
13-3 nature does not always rule supreme
13-4 the night is occasionally starless   13( )
1 By permission of Educational Testing Service, Princeton, N.J.

Two scores may be obtained: (1) speed-of-comprehension score, and
(2) level-of-comprehension score. The first of these "represents the
product of the rate at which an individual has attempted to comprehend
the test material and his success in ..." The second score "provides a
measure of the ability of the ... understand the meaning of poetry and
literary prose and of his familiarity with literary devices and modes of
expression."¹ Percentile norms are ... both at the high school and
college level. The ... measurement or reliability uses the standard error
of ... As for all other Cooperative tests, comparisons are ... basis of
scaled scores. There are three forms of the test. The reliability is very
high. For Form O the reported coefficient is .97.

Probably the most used test of high school reading is the Iowa Silent
Reading Tests, advanced tests. It has already been described in this
text (page 111). It is mentioned here because the passages to be read
are comparatively long and include poetry as well as ... science and
government. It also has good questions on selecting the ... of a ...
and its test of abilities to loo... facts ... reading functions. From its
part scores ... may be made.
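
The relation between a reliability coefficient and the standard error of measurement, to which the discussion of the Cooperative test above refers, may be sketched as follows. The usual formula, the standard deviation multiplied by the square root of one minus the reliability, is assumed here; the standard deviation of 10 scaled-score points is a hypothetical value chosen only for illustration and is not taken from the test manual.

```python
def standard_error_of_measurement(sd, reliability):
    """Expected spread of obtained scores around a pupil's true score."""
    return sd * (1 - reliability) ** 0.5

# Reliability of .97 is the coefficient quoted in the text;
# sd=10.0 is an assumed value for illustration.
sem = standard_error_of_measurement(sd=10.0, reliability=0.97)
```

Under these assumptions the standard error is a little under two score points, so that even a highly reliable test leaves some margin of doubt about any single pupil's score.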
The following are a few reading tests suitable for high school students.
The one to be used depends much on the problem being attacked.

1. Nelson Denny Reading Test. Houghton Mifflin Company, Boston.
2. Pressey Reading Tests. Ohio State Department of Education, Columbus, Ohio.
3. California Reading Tests, grades 7-13. Intermediate, grades 7-9;
   advanced, grades 9-13. California Test Bureau, Los Angeles, Calif.
4. Traxler Reading Tests. Public School Publishing Company, Bloomington, Ill.
5. Van Wagenen Reading Scales. Educational Test Bureau, Minneapolis, Minn.
6. Van Wagenen-Dvorak Diagnostic Examination of Silent Reading Ability,
   grades 6-12. Junior division, grades ... Senior division, grades 10-12.
   Educational Test Bureau, Minneapolis.

Of this list the most diagnostic is the one by Van Wagenen.


VOCABULARY TESTS

... a part of many of the tests of English ... and usage. The knowledge
of words and their ... Stanford-Binet, Terman-Merrill Revision, in the
Wechsler-Bellevue, and in the vast majority

1 Manual. 


... words. In the Cooperative Vocabulary Test¹ there is ...
subject-matter fields. All ... Thorndike's Teacher's Word Book of Twenty
Thousand Words ... defined. Percentile norms are furnished for public ...
secondary schools of the South (11 grades) ... Here are two samples at
the more difficult level (Form Y):
22. candor
    22-1 charm
    22-2 personality
    22-3 tact
    22-4 frankness
    22-5 logic

27. chimerical
    27-1 fantastic
    27-2 doubtful
    27-3 temporary
    27-4 bell-like
    27-5 synthetic

Another vocabulary test is the ... English Vocabulary.² ... present a
truly random sample of the intelligent general reader's vocabulary.
Experiments ... 150 words were ... necessary to secure a reliable test.
If ... the reliability was not greatly increased. ... illustrate ...
each of which has a ... reliability ...

propitiated them   (1) evicted (2) assisted (3) praised (4) appeased (5) angered
...tered the document   (1) wrote (2) read (3) recited (4) dis... (5) published

Other tests of word knowledge suitable for the high school are:

The Thorndike Test of Word Knowledge. Teachers College, Columbia University.
... Vocabulary Test ... Public School Publishing Company ...

1 Cooperative Test Division, Educational Testing Service, Princeton, N.J. Items
by permission.
2 ... & Company, Boston.


LITERATURE AND ITS APPRECIATION
LITERARY JUDGMENT, ACQUAINTANCE, AND APPRECIATION


This quality involves two ... distinguish between what ...
discrimination has been attempted by ... judging ... four samples and
which one the ... selected ... the best sample of poetry is usually ...
Scores on such a ... and memory. This phase of English instruction
... measured up to the present.

The meaning of the process of appreciation involves both feeling
tone and a judgment of value. This affective coloring which ...
to the judgment is aroused by ... they are aroused. ...¹
the objectives of appreciation, prepare items which test them, and
validate the items. He divided appreciation into two parts: fundamental
... recognition of rhythm, of meter, the grouping ... (... onomatopoeia).
... the comprehension of ...

1 Pooley, Robert, "Measuring the Appreciation of Literature," English Journal
(High School Edition) (1935) 24:627-633.


... of sounds is in the order of appreciation. The recognition of
..., and space progression adds something to the fundamental
appreciation. Secondary appreciative responses come to the individual
... of the means by which all the fundamental responses
are aroused. More specifically, appreciation is enhanced when the subject
recognizes the relationship between word order, sentence structure
and the content of the material, the appropriateness of the choice of
... to the content, and the figures of speech and when he identifies
himself with the characters portrayed.

Other aspects of appreciation have been emphasized by the staff of
the Progressive Education Association. Here are the headings of the
acts and verbal responses which are illustrated with appropriate
subheads in the text.¹


1. Satisfaction in the thing appreciated.
2. Desire for more of the thing appreciated.
3. Desire to know more about the thing appreciated.
4. Desire to express one's self creatively.
5. Identification of one's self with the thing appreciated.
6. Desire to clarify one's own thinking with regard to the life
   problems raised by the thing appreciated.
7. Desire to evaluate the thing appreciated.

There has been no measure of appreciation developed which attempted
to analyze out and then test the elements of which it is composed.
Most attempts at measurement have been content to offer an
opportunity to choose from selected poems or prose selections the best
one and the worst one or to rank them in order. Samples of these
attempts are now presented for both prose and poetry.

One of the most interesting attempts to measure the ability to judge
poetry, Exercises in Judging Poetry, was developed by ...²
... written with varying degrees ... "Read the poems A, B, C, D, trying
... sound if read aloud. Write 'Best' on the ... you like best as poetry.
Write 'Worst' above the one you like least." Here is an example:


Set 13. The Fog

A(   )
The fog comes
on little cat feet.
It sits looking
over harbor and city
on silent haunches
and then moves on.

B(   )
...
quiet as a cat.
It comes creeping over
the city
and stays there quietly until the
first thing you
know it is gone.

C(   )
The fog is like a maltese cat,
it is so gray and still,
and like a cat it creeps
about the city streets,
How gray it is! How cat-like!
Especially when
it steals away,
Just like a cat.

D(   )
Who sends the fog
so still and gray?
I fondly ask,
And Echo answers,
"E'en the same all-seeing Eye
that sends the still, gray cat."

There are altogether 13 groups of four, of which the sample above is
the most difficult.

1 Smith, Eugene R., Ralph W. Tyler, et al., Appraising and Recording Student
Progress, pp. 251-252. New York: Harper & Brothers, 1942. By permission.
2 ... New York: Bureau of Publications, Teachers College, Columbia University.


In M. G. Rigg's Measuring the Ability to Judge Poetry, comparison
is made between two ... are to be judged. The ... was done by 47 college
professors, ... of whom were professors of English. Two samples follow:¹

10  A(   ) The night was still. You could not hear the howls
           Of any birds or any bats or owls.
    B(   ) You could not hear, I thought, the voice of any bird
           The shadowy cries of bats in dim twilight
           Or cool voices of owls crying by night.
31  A(   ) Who shall declare the joy of the running!
           Who shall tell of the pleasures of flight!
    B(   ) Oh what a joy there is in running!
           And what pleasures there are in flight!

1 By permission of Bureau of Educational Research and Service, University of
Iowa, 1942.

The reliability coefficient of Form C with Form D is .72. 
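
A coefficient of this kind is ordinarily found by giving both forms to
the same pupils and correlating the two sets of scores. The sketch
below assumes that the product-moment (Pearson) correlation is the
statistic intended, which the text does not state. The function name
and the scores are hypothetical illustrations, not data from the test.

```python
from math import sqrt


def pearson_r(x, y):
    """Product-moment correlation between two equally long score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 scores")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and of squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("scores on each form must vary")
    return sxy / sqrt(sxx * syy)


# Hypothetical scores of six pupils on two forms of the same test.
form_c = [12, 15, 9, 20, 17, 11]
form_d = [10, 16, 11, 18, 19, 9]
alternate_form_reliability = pearson_r(form_c, form_d)
print(round(alternate_form_reliability, 2))
```

A value near 1.00 would mean that the two forms rank the pupils in
nearly the same order; a value such as .72 leaves considerable room
for a pupil's standing to shift from one form to the other.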

Other tests of appreciation suitable for high school are:

Cooperative Literary Comprehension and Appreciation Test. Cooperative
Test Service, New York.

Logasa and Wright Tests for Appreciation of Literature. Public
School Publishing Company, Bloomington, Ill.

Cook-Bixler Literary Appreciation Tests. Turner E. Smith and Company,
Atlanta, Ga.

The outstanding attempt to measure appreciation in the realm of
prose is the Prose Appreciation Test by Herbert A. Carroll. Tests are
available for (1) the junior high school, (2) the senior high school,
(3) college. Validity is claimed for this test on two counts: the
manner of selecting the material and the procedure used in validating
it. It was at first decided that four selections of differing
degrees of literary merit were to constitute each item. Each selection
was to be about 100 words in length. First choices were selected from
authors of the highest ability (Tolstoi, Cather, Conrad). The second
choices were selected from authors considered second-class (Harold Bell
Wright, ... Bailey). Third choices came from magazines of little
literary merit (Wild West Weekly, Love Story Magazine, etc.). Fourth
choices were deliberate mutilations of the first choice. These
selections were further validated by submitting them to (1) members of
university English staffs, (2) critics, ... of English. Only items
agreed upon by ... were retained.

There are ... sets with four selections each at the junior high
school level, 12 at the senior high school level, and 14 sets at the
college level. The instructions and the space for rating the
selections appear as follows (college level):1

1 By permission of Educational Test Bureau, Minneapolis, Minn.


DIRECTIONS: Read each set of selections carefully. Then choose the
selection which you consider has the most literary merit, the one
second in value, the one third, and the one fourth. Now in the column
below, bearing the number of the set which you have just read, write 1
opposite the letter of your first choice, 2 opposite your second
choice, 3 opposite your third choice, and 4 opposite your fourth
choice. Be sure that you put your answers under the right set number.
For example, if on set XX your first choice were C, second choice A,
third choice B, and fourth choice D, you would record your judgment as
follows:


Here is an illustration of the selections to be rated:1

A MAN

A
He came to Africa, one might have said, without a face—with a ...
embryonic boyish countenance upon which life had left no marks, but
now, at twenty-six, his features were hardened and sharpened ...
rather snub nose, the firm but sensual mouth, the blue eyes in which a
flame seemed to be forever burning. The fevers left their mark ...
there were times when, dead with exhaustion, he had the look of a man
of forty. Behind the burning eyes there was forming slowly a restless
... intelligence, blended oddly of a heritage from the shrewd ... who
was always right and of the lanky cleverness of a father he ... not
remember.

B
... Taylor was less than thirty. But he was a hundred years older
than Cecilia in soul. He was handsome, brown-haired, tall, “taller
than ... had already decided. He had laugh... He wore his evening
clothes ... He was rich, he associated on ... member of a very young
...

C
Peter was as handsome a fellow as a girl would hope to meet. He was
tall and broad shouldered, with eyes as blue as summer skies, hair
black as a raven’s wing, lips as red as red roses, teeth white as
milk, skin brown as a nut, wonderful hands, long legs, a wonderful
nose, the best-looking ... walked like a soldier, and had a wonderful
voice. He was a prince among men, that was what Peter was. He was so
handsome he ought to have been in the moving pictures. Everybody knew
this.

D
He was only thirty and he was tall and as fair as Diana was dark; he
was ... sophisticated and so rich that he never had to think about
money. He bought his clothes in London, his wines in France, his
automobiles in Italy ... His gray tweeds greatly became him. His eyes
were blue and just the shade of blue she most admired. His hands were
... and brown and well shaped. His voice was the correct sort of
voice, and he smelt of good tobacco and a certain brand of eau de
cologne.

1 By permission of Educational Test Bureau, Minneapolis, Minn.


Percentile norms based on 200 to 500 cases have been reported for
each of three levels. Probably the greatest weakness of the test is its
reliability which, whether computed by the split-half method or that
of test-retest, turns out to be .71.
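
The split-half method mentioned here can be sketched as follows. The
sketch assumes the common odd-even split with the Spearman-Brown
step-up to full test length; the text does not say which split or
correction was actually used for this test. All names and the item
scores (1 = right, 0 = wrong) are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Product-moment correlation between two equally long score lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman_brown(r_half):
    """Estimate full-length reliability from the correlation of two halves."""
    return 2 * r_half / (1 + r_half)


def split_half_reliability(item_scores):
    """item_scores: one list of item scores per pupil, same items for all."""
    # Items in positions 1, 3, 5, ... form one half; 2, 4, 6, ... the other.
    odd_totals = [sum(row[0::2]) for row in item_scores]
    even_totals = [sum(row[1::2]) for row in item_scores]
    return spearman_brown(pearson_r(odd_totals, even_totals))


# Hypothetical results of five pupils on a six-item test.
pupils = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 0, 1, 0],
    [1, 0, 1, 1, 0, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
]
reliability = split_half_reliability(pupils)
print(round(reliability, 2))
```

The test-retest method would instead correlate total scores from two
administrations of the same test, using the same correlation function
without the Spearman-Brown step.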


LITERARY ACQUAINTANCE 


Part
I. Pre-Renaissance and E
II. English and
III. Modern En


ОТЕ i 30 


ju 
of 


те, 
nd Utopia. He is asked the 25^ bis 


sampling of
or any other test makes. The questions in all three parts are
in the form of the following sample.


27. The poet who most interested Amy Lowell was 
1. Shelley 
2. Swinburne 
3. Arnold 
4. Keats 
5. Byron 


LIST OF LANGUAGE AND LITERATURE TESTS: HIGH SCHOOL

(Name of test. Grades. Kind of test. Kinds of scores or norms.
Publisher.)

Barrett-Ryan-Schrammel English Test. Grades 9-12. Survey. World Book
Company.

High School Spelling Test, Revised. Grades 9-12. Survey. Words
spelled correctly. Turner E. Smith Co.

Cooperative English Test: A. Mechanics of Expression (total score
only); B. Effectiveness of Expression (total score only); C. Reading
Comprehension (four separate scores). Survey. Scaled scores.
Cooperative Test Service.

Sixteen Spelling Scales. Survey. Grade norms. Teachers College,
Columbia University.

Hillegas Scale for the Measurement of Quality in English Composition
for Young People. Survey. Grade norms. Teachers College, Columbia
University.

The Cooperative English Test: Reading Comprehension. Survey. Scaled
scores. Cooperative Test Service.

The Cooperative [title not legible]. Survey. Scaled scores.
Cooperative Test Service.

Nelson-Denny Reading Test. Grades 9-16. Survey. Grade norms. Houghton
Mifflin Company.

Pressey Reading [title not fully legible]. Survey. Grade norms. Ohio
State Department of Education.

California Reading Tests. Survey (and diagnostic analysis). Grade
norms and percentiles. California Test Bureau.

Traxler Reading Tests. Survey. Percentile norms. Public School
Publishing Co.

Cooperative Vocabulary Test. Survey. Percentile norms. Cooperative
Test Service.

[Title not fully legible] Vocabulary Test. Grades 9-12. Survey. Grade
norms. World Book Company.

Inglis Tests of English Vocabulary. Grades 9-12. Survey. Grade norms.
Ginn & Company.

Abbott and Trabue Exercises in Judging Poetry. Grades 9-12. Survey.
Not standardized. Teachers College, Columbia University.

Rigg Measuring the Ability to Judge Poetry. Survey. Number of items
right. Bureau of Educational Research and Service, University of Iowa.


The norms given are for the freshman, sophomore, junior, and senior
years at the college level.

A special difficulty arises in selecting tests of literature for a
particular school because the test items may be unfamiliar to the
students. Tastes of English teachers are so different that the
selections studied are widely varied. It is thus of great practical
importance for the teacher to have a hand in the selection of the test
to be used, for, after all, a test must have curricular validity.


Other tests of literary acquaintance are:

1. Smith and Bixler. Awareness Test of Twentieth Century Literature.
Turner E. Smith Co., Atlanta, Ga.

2. Barrett-Ryan Literature Test. Bureau of Educational Measurements,
Kansas State Teachers College, Emporia, Kans.

3. Analytical Scales of Attainment in Literature, grades 7-8, 9-12.
Educational Test Bureau, Minneapolis, Minn.


SUMMARY

The measurement of objectives of language teaching has been most
successful in the areas of written language. For the elementary school
there are tests of language usage, punctuation, capitalization, and
spelling. The study of the most common and most persistent errors
of grammar has made it possible ... correct and incorrect ... good or
bad sentences ... of speech are available.

In the high school, tests of the more formal aspects of language,
such as those of punctuation, spelling, capitalization, and usage
are ...

... in measuring literary understanding and appreciation is to avoid
superficial aspects of authorship and ... acquaintance and to test for
the true ... (3) of a quotation. Some items ask about a ... of a poem
or story; others ask that well-known quotations be completed; while
still others ask that a character be recognized from his ...

Elements of good taste in writing are measured by scores on the test
of language usage. There are few or no tests of the ability to
discriminate between good and poor poetry, nor are there any tests of
the organization and development of thought except for scales of ...
in the upper grades.

In the high school, provision has been made for testing the capacity
to read and understand literary selections. Tests of literary
discrimination, both of poetry and prose, have been constructed,
although these tests are not sufficiently well standardized to warrant
much confidence in them. Tests for the organization and development of
the paragraph by recognizing the proper sequence of sentences seem
promising.


QUESTIONS AND EXERCISES

1. a. Describe the objectives of language instruction.
   b. Which general test battery seems to you to measure language
      usage ...
   c. Plan out a testing program as a preliminary procedure for an
      attack upon a general ... of language improvement.

2. Why are objectives in the teaching of English so difficult to
define? How does theory enter into your explanation?

3. a. Describe the subjective outcomes of English instruction.
   b. Have they been well measured by standard tests? Explain.
   c. Describe the tests used in testing literature in the elementary
      school.

4. Evaluate the tests of English usage at the high school level. Why
are objectives relating to usage more easily measured than those of
appreciation or judgment? What procedures are used in measuring
capitalization, spelling, and punctuation? Which ones are to be
preferred?

5. How does the measurement of ordinary reading differ from that of
measuring literary selections including poetry?

6. What difficulties appear in the measurement of organization? In
your judgment how effective is the test of organization?

7. On what principle was the first composition scale constructed?

8. What factors are to be overcome in measuring the understanding of
literary selections which do not appear in the understanding of
selections not of a literary nature?

9. Compare the vocabulary tests mentioned as to (a) manner of
selecting the words, (b) the arrangement of words, and (c) the
reliability of the different tests.

10. What effect might a test of literary acquaintance have on the
teaching of literature? How can this danger be avoided?
BIBLIOGRAPHY 


Books

GREENE, HARRY A., ALBERT N. JORGENSEN, and J. RAYMOND GERBERICH:
Measurement and Evaluation in the Secondary School, Chap. XIV. New
York: Longmans, Green & Co., Inc., 1943.

HAWKES, HERBERT E., E. F. LINDQUIST, and C. L. MANN: The Construction
and Use of Achievement Examinations, Chap. VIII. Boston: Houghton
Mifflin Company, 1936.

ODELL, C. W.: Educational Measurement in High School, Chaps. IV, V.
New York: Appleton-Century-Crofts, Inc., 1930.

REMMERS, H. H., and N. L. GAGE: Educational Measurement and
Evaluation, pp. 33, 214, 302-304. New York: Harper & Brothers, 1943.

ROSS, C. C.: Measurement in Today's Schools, pp. 46-49. New York:
Prentice-Hall, Inc., 1947.

SMITH, EUGENE R., RALPH W. TYLER, et al.: Appraising and Recording
Student Progress, pp. 246-276. New York: Harper & Brothers, 1942.

SYMONDS, P. M.: Measurement in Secondary Education, Chap. V. New
York: The Macmillan Company, 1927.

TRAXLER, ARTHUR E.: Techniques of Guidance, pp. 78-81. New York:
Harper & Brothers, 1945.

Articles

CARROLL, HERBERT A.: "A Method of Measuring Prose Appreciation,"
English Journal (1933) 22.

CARROLL, HERBERT A.: "A Standardized Test of Prose Appreciation for
Senior High School Pupils," Journal of Educational Psychology (1932)
23: 401 ff.

LOGASA, HANNAH, and MARTHA JANE McCOY: "Tests for Measuring
Appreciation," School Review (1925) 33: 491-492.

POOLEY, ROBERT: "Measuring the Appreciation of Literature," English
Journal (High School Edition) (1935) 24: 627-633.

RIGG, MELVIN G.: "Measuring the Ability to Judge Poetry," Proceedings
of the Oklahoma Academy of Science (1939) 19: 157-158.

SMITH, DORA V.: "Diagnosis of Difficulties in English," "Educational
Diagnosis," Thirty-fourth Yearbook of the National Society for the
Study of Education, Chap. XIII, pp. 220 ff. Bloomington, Ill.: Public
School Publishing Company, 1935.


CHAPTER 7 
Measurement of the Social Sciences 


Measurement in the social sciences has been retarded because of
a failure on the part of curriculum makers to agree upon desired end
products of social-studies teaching, and because of the difficulty of
measuring the achievement of goals which are more and more being
stated in terms of social performance. The differences in the materials
of instruction have been emphasized because of the two general
approaches to this problem. One of these, the older, divided the social
sciences into well-integrated parts: history, geography, economics, and
civics. History, in turn, was divided into American, European, ancient,
modern, medieval, and world. The second approach, the more recent,
attempts to integrate all the social sciences around dynamic problems
of the present day.


In the first of these approaches the pupil studied certain bodies of
knowledge which had been watered down from the college courses in
the ... and which, ... through research, were logically ... When a
student had finished a course in ... of our country, the French and
Indian War, the Revolutionary War, the formation of the Constitution,
and so on down to the present. He had studied the facts, meaningful
and otherwise, ... of American history. And so it was with the other
histories, economics, geography, civics, etc.

The second of these approaches kept its ... the experience of mankind
... presented in recorded history to be focused on the problems of the
... be solved or really understood ... economics understood, or the
nat... In this approach, the focus was not on history as such but on
the solution of the problem. How co... problems of segregation ...
knowledge of the history ... the moral problems involved, the fu...
and the question of the geographical

distribution of races? This second approach is apt to omit much of
recorded history, a great deal of economic theory, some of the
intricacies of geography and to use only that material which helps
solve the problem.

Having these two somewhat contradictory methods of approach,
the selection of objectives toward which the teacher is working is
doubly difficult.


OBJECTIVES IN THE TEACHING OF THE SOCIAL SCIENCES

The objectives selected from lists collected by workers in the field,
from what teachers say they are striving to do, and from criticisms
of tests will necessarily include a variety. Of the number of
objectives available, there will be included only those that are
generally agreed to.

1. Information about social relations—functional, meaningful facts.
2. Methods of acquiring information—skills1
   a. Read to understand
   b. Engage in group discussion
   c. Listen attentively to oral presentation of materials
   d. Consult maps to locate specific information
   e. Recite in class
   f. Read to locate information
   g. Consult charts and diagrams to locate specific items of
      information
   h. Make an outline or brief
   i. Give a special report or “floor talk”
   j. Make a summary or précis
   k. Draw a map
   l. Consult graphs and statistical tables to locate specific items
      of information
   m. Observe pictures, scenes, models, relics, exhibits, bulletin
      boards, etc., to locate specific items of information
   n. Write an expository theme explaining trend or ... cause-effect
      relationship
   o. Read for enjoyment
   p. Read to memorize—intensive reading and rereading
   q. Take part in committee work
   r. Observe pictures, scenes, models, relics, exhibits, bulletin
      boards, etc., for general impression and emotional enjoyment
   s. Study maps to understand all the ideas they contain
   t. Draw a diagram or chart
3. Evaluation of information
   a. The ability to judge an event in the light of the times in which
      it occurs
   b. The ability to weigh evidence and to judge the sources of
      information
   c. The ability to comprehend causal relations
   d. The ability to distinguish between relevant and irrelevant
      material
4. The development of attitudes and interests
   a. The acquisition of desirable attitudes toward government,
      other races, other persons, standards of living and, in general,
      toward the problems of human relations
   b. The development of some appreciation of the difficulties
      involved in the everyday problems of living
   c. The acquisition of interest in good government, fair prices,
      the problems of capital and labor, conditions of work, and the
      good life.

1 The 20 objectives listed were taken from Kelley, T. L., and A. C.
Krey, Tests and Measurements in the Social Sciences, pp. 64-69 (New
York: Charles Scribner's Sons, 1934) by permission. Kelley and Krey
gained the cooperation of school teachers in evaluating 52 items
selected from a larger list. The present list contains the 20 items
which were rated highest, arranged in order of their rating.
When these objectives are considered as a whole, we find present
information, skills and techniques, judgment, interests, and
attitudes. A part of these objectives is concerned with participation
in the ... organization itself: taking part in committee work, making
a report on some problem, reciting in class, or visiting a session
of court or the legislature. Another part has to do with collecting
and interpreting information: learning to use tables of contents,
indexes of books, standard reference works, encyclopedias, newspapers,
maps, statistical tables, graphs, etc.

The third part, having to do with judgment, appreciations, attitudes,
and interests, does not result immediately from instruction but is an
accumulation over the years as a result of good teaching and learning.

THE MEASUREMENT OF OBJECTIVES

It is apparent that these objectives differ in their ease of
measurement. Easiest of all to measure is information, and for this
reason, ... of information have been developed ... good measuring ...
Now our interest is directed toward the meaning and significance of
events and not ... facts as such. There is less emphasis,


TESTS IN THE SOCIAL STUDIES

ELEMENTARY SCHOOL

Test Batteries


ust 

е for ай 

The Coordinated Scales of Attainment have tests designed ;ficaP. 
for each Brade.! For this 


facts in history 


53. Who was the President of t 
1. Wilson 2. Polk 


One can see in these illustration. the attempt to include 
questions. 


following two items illustrate the type of questions used at the upper
levels. The first is from Battery 7:

34. In the Northwest Ordinance, Congress set aside one section in each
township for the support of
1. churches 2. relief 3. roads and bridges 4. local government
5. schools

The other is from Battery 8:

3. European nations were warned that they should keep out of American
affairs by the
1. Ostend Manifesto 2. Monroe Doctrine 3. Kentucky and Virginia
Resolutions 4. Hartford Convention 5. Wilmot Proviso


The tests of geography suitable for grade 5 consist of 60 items. Eleven 
questions are based on the interpretation of two maps. There are ques- 
tions on crops, imports, climate, products of states or countries, animals, 
harbors, cities, etc. Here again is the attempt to make the questions 

functional. The reason that a certain sort of wheat is called winter
wheat, what latex is, how to calculate the shortest distance between
two cities on a map, how ocean-going vessels reach Washington—these
are typical questions. Two illustrations are:

49. A region in which the land, climate and vegetation are about the
same is called a
1. unit region 2. mountain region 3. cultured region 4. farming
region 5. natural region

The place in which iron is separated from the iron ore is called a
1. reducer 2. separator 3. blast furnace 4. still 5. purifier
Tests of geography suitable for each grade are furnished through
grade 8. Battery 7 (grade 7) ... geography ... Australia. There is a
map of Australia together with six ... location ... there are 15
questions ... its animals, its wool, the location of its population,
etc. The rest of the test has ... Asia and Africa.

These tests of geography, which include many questions involving
the interpretation of maps, constitute very satisfactory tests
which reflect ... outcomes of instruction in the social studies.
The Metropolitan Achievement Tests1 first introduce tests in social
studies in the intermediate battery, designed for grades 5 and 6. Test
7 is labeled “Social Studies: History and Civics,” and Test 8, “Social
Studies: Geography.” The test of history and civics is composed of ...
items, about one-fourth of which are on civics and the rest on history.

1 Items quoted from this test by permission of World Book Company,
Yonkers, N.Y.

In the items which test achievement in civics, children are asked to
understand principles used in the selection of ..., which department
first deals with criminals, and who immigrants and aliens are.
The items on history include questions on the discoverers, on the
Civil War and Southern recovery, on such inventors as Stephenson
and Edison, and on colonization. The history and civics test which is
contained in the advanced battery contains 51 items and is intended
for grades 7 and 8, and the first half of grade 9. The questions on
civics deal with problems of citizenship, immigrants, the regulation of
airplane routes, and who usually does the picketing. There are 43
questions on history. Twelve of these questions are about persons such
as Theodore Roosevelt, Booker T. Washington, and Cabot. There
are questions about the Mexican War, Civil War, War of 1812, the first
World War, and Revolutionary War.


Illustrations from the intermediate battery are:

10. The purchase of Louisiana was important because:
1. it didn't cost much 2. we bought it from France 3. it gave the
United States complete control of the Mississippi Valley 4. it
contained many Indians who wanted to trade furs for goods

44. In the main, police powers are exercised by the:
1. states and local communities 2. Federal government 3. ...
4. criminologists

In general, the arrangement ... reason. ... chronological order. For
example, ... as Lapland, China, Germany, France, Mexico, Brazil, Iran,
... The student is asked to leap from items like

33. The principal racial element in Mexico is
1. Negro 2. British West Indian 3. white 4. American Indian

to items such as

34. The chief export from Chile is
1. wool 2. meat 3. nitrate 4. coal

Geography in the advanced battery deals with topics similar to those
of the intermediate battery. Its 53 items also ask questions about
products, ..., location of places, rivers and lakes, population, and
... of a wide variety of states and countries. Questions are asked
about a country's most important natural resources, Java's leading
product, the leading occupation of the Chinese, and what Yugoslavia
exports. Many of the questions involve interpretation such as
why a region can produce much cotton, why southeastern Alaska has
developed more rapidly than other sections of Alaska, and what the
smelting of iron ore requires.

The weakness of the use of judicious sampling in selecting test items
for testing achievement in the social studies has been suggested. It
does seem that, from the standpoints of curricular validity and of
problem organization, items could be grouped around natural centers.

The Stanford Achievement Tests also have tests on social studies.


Specific Tests of the Social Studies

Many of the older tests of geography and history for the elementary
school are now out of date. They are no longer valuable because their
items are concerned too largely with small bits of information and
consequently emphasize interpretation too little. For those interested
a few are listed at the end of this chapter.

The Cooperative Social Studies Test for Grades 7, 8, 9 is one of the
newer tests and one which undoubtedly is to be used in the upper
grades. It is reviewed on page 195 of this text. Also worthy of
consideration in this connection is the Kelty-Moore Test of Concepts in
the Social Studies. There are two forms, of 35 concepts each, available
for testing concepts acquired in the social studies from grade 4
through the junior high school.

Geography Tests

There are several geography tests suitable for testing in the
elementary school. Two tests have been selected because they exemplify
attempts to measure techniques and understandings gained from the
study of geography rather than disconnected facts.

The Wiedefeld-Walther Geography Test is an illustration of a test
that although old (1931) is still good because it was well built. It is
divided into two parts, with three subheads under each part:


Part 1. Study abilities in geography 

Test I. Reading 

"Test II. Organization 

Test III. Map and graph reading 
Part 2. Geography information 


... many desirable characteristics, its usefulness ... and it is now
out of print.

There is another test, too, which tries out the ability of pupils to
interpret maps, graphs, charts, etc. This is Test B, work-study skills,
of the Iowa Every-pupil Tests of Basic Skills. Two parts of Test B bear
directly on the problem of measuring the outcomes of social studies:

Part I. Map reading—Sections A, B, and C

Part V. Reading graphs, charts, and tables

Part I has three sections, containing altogether 40 questions with
appropriate maps for each section. All maps are artificially
constructed but include significant facts. An example with two
questions is shown in Fig. 17.


It is thus clear that ... tended as a test of ... studies, for the
techniques used test some of the most desirable outcomes of teaching.

Part I. Map Reading
Section B

Directions: The questions to the right are based on the map below,
which is a map of an imaginary land. Answer the questions in the same
way that you did those in Section A.

16. Which of the following would be a probable cause of difficulty in
building a railroad from K to I?
1) Lack of water
2) High mountains
3) Thick jungle
4) Lack of wood for ties

18. How does the longest day of the year at A compare with the longest
day at L?
1) One cannot tell from the map
2) They are the same
3) The longest day at A is longer
4) The longest day at L is longer

[Map of an imaginary land. Legend: Mountains, Cities.]

FIG. 17. Work-study skills, Iowa Every-Pupil Tests of Basic Skills. (By
permission of Houghton Mifflin Company, Boston.)

SECONDARY SCHOOL

... Information and Meanings

Samples of tests taken from the various subjects which together
constitute the social science will first be illustrated and evaluated.


History Tests

Tests of American, European, ancient, and world history have been
constructed. The most used of these are tests of American history.
The Cooperative American History Test is divided into two parts.
The first part, consisting of 62 multiple-choice items, samples events
up to the Spanish-American War. The second part, of 36 items, samples
events occurring between the end of the ... War and the time the test
was constructed. While much of the test is purely factual, there is a
definite attempt to present the questions in a functional, meaningful
way. Illustrations from Form Q follow:1


5. Large plantations were not established in the New England colonies
because
5-1 these colonies prohibited slavery
5-2 most of the people lived in villages and towns
5-3 nearly all capital was invested in commercial enterprises
5-4 the soil and climate were not adapted to such a system

15. The outstanding hero of the Revolutionary War in the West was
15-1 Nathanael Greene
15-2 Ethan Allen
15-3 George Rogers Clark
15-4 Light-Horse Harry Lee

25. The Monroe Doctrine was intended to
25-1 end our alliance with France
25-2 prevent trade betw...
25-3 promote American ...
25-4 prevent European ...

... cover the period between the first col... items deal with
pre-Revolutionary ... 50 ... of an average child in average school
... amount of instruction.” The scores run from 1 to 100 and are ...
satisfactory bases of comparison both between tests and for the same
... from one test to another.

Let us look more closely at these scaled scores used by
the Cooperative tests. A quotation from the Cooperative ...
will throw more light on the meaning of scaled scores.

For example, the “50 point” on the Cooperative ... Test represents
the score on this test made at the end of ... typical instruction in
the twelfth grade by a student having the following characteristics:
(1) intelligence quotient in the seventh grade (where little selection
has occurred) between 98 and ...,

1 Items by permission of Educational Testing Service, Princeton, NJ. 


MEASUREMENT OF THE SOCIAL SCIENCES 193 


(2) ge betw 4 15 as of grade 9.0 3) score of 92 on 
а een 14.25 and 14.75 f 8 

: 5 ©, 
the New Stanford Achievement Test at grade 8.4. ш ; 


By assuming this group to be normal and to extend for 5 standard
deviations in either direction from the mean, there would be altogether
10 standard-deviation units. Now if we divide each of these
into 10 smaller parts, we have exactly McCall's T-score with a
mean of 50 and an S.D. of 10. The 100 units along the base line are as
equal to each other as are any other units known to mental
measurement.
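
The arithmetic behind such a scale can be sketched in a few lines of Python. This is a minimal illustration only, assuming the norm group is already approximately normal so that a straight linear conversion suffices (McCall's own procedure worked from percentile positions). The function name `t_scores` and the sample raw scores are hypothetical and are not taken from any Cooperative manual.

```python
from statistics import mean, pstdev


def t_scores(raw_scores):
    """Convert raw scores to a scale with mean 50 and S.D. 10.

    Assumption: the scores are those of the norm group itself, so the
    group mean and population S.D. define the scale.
    """
    m = mean(raw_scores)
    sd = pstdev(raw_scores)  # S.D. of the whole norm group
    if sd == 0:
        raise ValueError("all scores are identical; the S.D. is zero")
    # Each raw score is expressed in S.D. units from the mean,
    # then stretched so that one S.D. equals 10 scale points.
    return [50 + 10 * (x - m) / sd for x in raw_scores]


# Hypothetical raw scores for three pupils.
scaled = t_scores([40, 50, 60])
```

A pupil exactly at the group mean receives 50, and a pupil 5 standard deviations above or below it would fall at 100 or 0, which is why the base line runs to 100 units.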


A list of other tests of American History appears on page 203.

The Cooperative Modern European History Test satisfies more nearly
the criteria for judging such a test than any other test of European
history.1 It is divided into two parts. Part I contains 62 items of the
multiple-choice variety. It deals with the understanding of "fundamental
movements and institutions as well as of personages, locations
and ... events." The second part attempts, without too much success,
to measure historical judgment with 35 items. Perhaps a sample
or two from each part will show something of the content and the
manner of testing it. The following items are from Part I (Form Q):

10. One of the reasons that led Gustavus Adolphus to engage in
    the Thirty Years' War was
10-1 the desire to recover territory lost after the death of Charles XII.
10-2 the desire to aid the German Protestants.
10-3 resentment against Cardinal Richelieu.
10-4 the fear of losing Norway.
21. The fall of the Bastille in 1789 was important because
21-1 Lafayette was imprisoned there.
21-2 large quantities of munitions and firearms were stored there.
21-3 it was strategically located.
21-4 it symbolized the tyranny of the government.


The following example is from Part II (Form O):

33. The method proposed in the Covenant of the League of Nations for the
    prevention of war was the
33-1 creation of a superstate with wide police powers.
33-2 holding of an international plebiscite.
33-3 establishment of compulsory arbitration.
33-4 abolition of armaments.

The facts utilized in this test are well selected but with possibly too
great an emphasis on political and too little on economic and
social ones. Its norms based on 6,000 cases from


1 
‘Das Permiss; А 
“ineeton, N J for using Cooperat! 


194 PROBLEMS OF MEASUREMENT 


reported in scaled scores. The reliability of the test, .91 with a single ...

In addition to these tests just described, the Cooperative Test Service
has published satisfactory tests in ancient and world
history. The same principles of construction and standardization are
used as were employed in the tests of American and Modern European history.
Three other tests constructed by instructors at the Kansas State Teachers
College are worthy of consideration. These tests are called the Kansas
American History Test, the Kansas Modern European History Test,
and the Taylor-Schrammel World History Test.


Economics Tests 


For those high schools which give a separate test in economics, the
Cooperative Economics Test is available.1 This test consists of two parts.
Part I contains ... items to be answered by matching statements and ...
multiple-choice items. Part II contains 30 multiple-choice items. There
is a wide sampling of the field of economics. Here is a
sample of the matching items:

1. Special assessments       22. Largest source of revenue to the
2. Income tax                    federal government              22( )
3. Poll tax                  23. Largest source of revenue to
4. Sales tax                     most local governments          23( )
5. General property tax      24. Can be easily shifted           24( )

The following are two samples of the multiple-choice items, the first
from Part I, and the second from Part II:

36. A factor tending toward inflation is
36-1 an unbalanced federal budget.
36-2 increased production of consumer goods.
36-3 rising taxes.
36-4 labor troubles.
27. Which term best describes the United Mine Workers of America?
27-1 Trade union
27-2 Industrial union
27-3 Affiliated union
27-4 Company union

The test consumes 40 minutes of testing time. It has percentile norms
at the high school and college level and a reliability adequate for
ordinary purposes. It has been criticized because Part I begins with
Item 10 instead of Item 1 and Part II with Item 20. Also,
a few of the answers to the items ...




Civics Tests


A few high schools stick to the older type of courses in civics and
government, in which case the American Council Civics and
Government Test might be helpful. This test is suitable for use in an
advanced course in high school. It is divided into four parts. Part I is
made up of 108 true-false items. Part II contains 13 matching exercises
with five matches to be selected from eight possible ones. Part III contains
24 multiple-choice items, there being five choices for each item.
Part IV is constructed of 23 completion items. All told, the test samples
a wide area of the subject and uses 90 minutes of time. Its percentile
norms are based on a rather small number of high school and college
students. Its reliability is reported as .88 as computed by the Spearman-
Brown formula. There are two forms of the test.
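
For readers who have not met the Spearman-Brown formula, a short Python sketch may help. It estimates the reliability of a lengthened test from the reliability of a shorter one, the usual case being the correlation between two halves of a test stepped up to full length. The function name `spearman_brown` and the half-test correlation of .786 are hypothetical, chosen only so that the result lands near the reported .88; the test manual's actual half-test figure is not given here.

```python
def spearman_brown(r, n=2.0):
    """Estimate the reliability of a test made n times as long.

    r is the reliability (or half-test correlation) of the original
    test; n = 2 corresponds to stepping up a split-half correlation.
    """
    if not 0.0 <= r <= 1.0:
        raise ValueError("r must lie between 0 and 1")
    if n <= 0:
        raise ValueError("n must be positive")
    return n * r / (1 + (n - 1) * r)


# Hypothetical split-half correlation stepped up to full length.
full_length = spearman_brown(0.786)       # roughly .88

# The same formula for a test lengthened fourfold (hypothetical figures).
fourfold = spearman_brown(0.68, n=4)      # roughly .89
```

The formula assumes that the added material is comparable to the original items, which is a strong assumption when parts of a test differ in form, as the four parts of this test do.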


Testing Problems, Skills, and Procedures

The second method of approach to teaching of the social sciences
emphasizes the focusing of facts upon the problems of the present. To
secure a greater understanding of today's problems, emphasis must be
placed on understanding of what is studied. To understand, the student
must read with understanding, must be acquainted with the special
terms embodied in reading and in addition must know the techniques
of reading graphs and tables and of discovering the sources of
information. He must know when to use encyclopedias and atlases and how
to take advantage of a table of contents or an index.
We shall use three illustrations as samples of what such
tests do: (1) the Cooperative Social Studies Test for Grades 7, 8, 9,
(2) the Cooperative General Achievement Test, Form X, and (3) Test
of Critical Thinking in the Social Studies. Other tests do the same, but
perhaps less well. It will be noticed that these include tests for the
elementary school as well as the secondary school.

The Cooperative Social Studies Test for Grades 7, 8, 9 is divided into
three parts. Part I, Facts, Skills, and Applications, consists of 75 items,
each with five choices in the answer.1 It consumes 40 minutes of time
in the taking. The items of the test cover a variety of subjects. One
has to reason as to why the United States has decided at
this time to build a larger navy, why Americans were more concerned
about the First World War than about the Second World War, and for
what gasoline taxes are most often used. It has problems on the inter- ...
An illustration of the manner of testing appears in the following items (Form R):

1 Educational Testing Service, Princeton, N.J. Items by permission.




44. Which one of the following has worked to the advantage of the poorer
    section of the country at the expense of the more prosperous states?
44-1 The workman's compensation law
44-2 The federal relief system
44-3 The wages and hours law
44-4 Tariff on manufactured articles imported into the United States
44-5 The method of collecting the income tax                        44( )

48. Which one of the following would be the best place to find an answer
    to the question: "What air line carried the most freight during 1939?"
48-1 An encyclopedia
48-2 The Reader's Guide
48-3 An atlas
48-4 The World Almanac
48-5 Who's Who in America                                           48( )

Part II, Terms and Concepts, consists of 45 terms and concepts to be
defined in 15 minutes. Such terms as "cabinet," ... "legislature,"
"revolt," "diplomat," and "levees" are illustrative.

24. Customs duties are collected when
24-1 goods are brought into a country.
24-2 people pay an income tax.
24-3 checks are cashed at a bank.
24-4 a tax is collected for each article bought.
24-5 people are fined for breaking a law.                           24( )

Part III, Comprehension and Interpretation, is made up of
reading passages from the social sciences. Percentile norms have been
derived.


The Cooperative General Achievement Test, Form X, is another of the
Cooperative test series. It is divided into two
parts. In Part I, Terms and Concepts, 15 minutes is the time allowed
for identifying the correct definitions of 50 terms. The student is expected
to know the meaning of "the Black Death," "depreciation,"
"filibuster," "enfranchised," "plutocracy," and "federation." He is called
upon to know the principal
results of the Crusades, what the ... of a short ... and what an agrarian
economy is. Part II, Interpretation, is pretty largely a reading test
employing seven short paragraphs and one graph about which questions
are asked. The whole test, which takes 25 minutes of working time, is to
measure the ability to read and interpret such material.

The Test of Critical Thinking in the Social Studies,
by J. Wayne Wrightstone, is divided into three parts, each of which
requires ... minutes in taking. In the elementary series, meant for
grades 4 to 6, Part I itself is divided into three sections which
altogether ask ... questions. The first section of Part I furnishes tables of
prices, of the ... of hogs, and of population, location, principal
products ..., and graphs of production and altitude. Questions are
then asked directly on these data. The second division consists of six
questions on the location of facts. The third division is on the capacity
to use an index. Part II, on drawing conclusions from facts, is more
distinctive than any other part. The instructions themselves indicate
immediately a different sort of test:

Mark with (+) every statement which is true and can be proved by the facts stated.
Mark with ... every statement which might be true but cannot be proved by the facts
stated.
Mark with ... every statement which is false as shown by the facts stated.


Here is an example from the test:

III. When bricks are taken out of the kiln or oven they are red and very hard. They
are ready for use. Bricks will last for hundreds of years. They will not decay
or fall to pieces as wood does. They will not burn. They are not costly. These
qualities make bricks very useful in building and they often take the place of
wood.

 9. Bricks vary in price, quality and make                            9( )
10. Bricks have many good qualities which sometimes make them
    more useful than wood                                             10( )
11. Bricks have many more lasting qualities than wood                 11( )
12. Because bricks spoil rather quickly and are so expensive, they
    cannot take the place of wood                                     12( )

Part III, on applying general facts, consists of nine paragraphs with a
matching test for each paragraph. The directions and one sample will
illustrate the procedure used.


Directions: This section has a number of paragraphs. Below each paragraph
are two sets of statements about the paragraph. In the left hand column are five
statements. Three of those statements will help you to understand the three
references in the right hand column. Select a statement from the left hand column which
best explains a reference in the right hand column. Write the number of the
statement in the space after the reference.

1 By permission of Bureau of Publications, Teachers College, Columbia
University, New York, and of J. Wayne Wrightstone.


VI. Although new traffic rules are being made all the time, there are many
automobile accidents. Every year thousands of people are killed by ...
drivers. Hundreds of children are killed while playing in the streets. Although
there seem to be too many automobiles, they are very useful in business and
transportation. The building of elevated roads and the invention of new
safety devices would help reduce accidents.

1. Most laws are made to help the people.
2. Machines have helped us make greater progress.
3. Improvement of machines needs an inventive people.
4. Transportation follows natural roads.
5. Industry in these days needs science.

16. Explains why traffic rules are ...                                16( )
17. Explains why automobiles are important in industry                17( )
18. Explains how new safety devices may help reduce accidents         18( )


on 


This Test of Critical Thinking in the Social Studies deserves special
consideration because it attempts to measure one of the most important
... of instruction, i.e., critical thinking. Such thinking ... consideration
of the facts which have already ... Part II, on drawing conclusions, and
Part III, on applying general facts, contain much that would fall under
the category of ... the nature of critical thinking.


The results from this test correlate highly with scores on the Modern
School Achievement Test and the New Stanford Achievement Test as
well as with McCall's Multi-mental Scale, a scale of intelligence. These
facts indicate that perhaps critical thinking enters into the taking of all
tests and plays a large part in reading. They also imply that perhaps
this test of critical thinking is nothing new, after all, but another test
of the skills demanded in the mastery of the materials of social studies.
The test has satisfactory reliability and a manual which offers excellent
instructional procedures to use with those pupils who have low scores
on the test.
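
What it means for two sets of scores to "correlate highly" can be made concrete with a brief Python sketch of the product-moment correlation. The two score lists below are invented for illustration; they are not Wrightstone's data, and the function name `pearson_r` is likewise hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Product-moment correlation between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length, at least 2 scores")
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    # Sums of cross products and of squared deviations from the means.
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("one list has no variation")
    return sxy / sqrt(sxx * syy)


# Hypothetical scores of five pupils on two tests.
critical_thinking = [12, 15, 19, 22, 27]
reading = [30, 34, 41, 43, 52]
r = pearson_r(critical_thinking, reading)
```

A coefficient near 1.00 for pupils like these would mean that the two tests rank the pupils in nearly the same order, which is the ground for suspecting that they draw on much the same abilities.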


Tests of Social Terms

The understanding of written material in the area of social science
may be measured (1) by the number of questions asked about a
paragraph or selection, or (2) by selecting those terms that are characteristic
of treatment of social relations and making a test for them. Among the
tests of social terms are (1) the Wesley Test in Political Terms, (2) the
Wesley Test of Social Terms, (3) Pressey's Test of Concepts Used in
the Social Studies, and (4) the Kelty-Moore Test of Concepts in the
Social Studies.

The Wesley Test in Political Terms is composed of items which are
functional and which have wide applicability. The test terms were
selected from the Krey-Kelley list of 4,000 words and terms used in the
social sciences. The separate items were evaluated by 27 college
instructors and 13 members of the working staff. Political terms included
those with military, diplomatic, and legal implications and other terms
which are related to government. After considerable experimentation
the final test was cast in the best-answer type and has four forms of 10
items each. The reliability is .68 for each part but when all 40 words are
used the reliability is satisfactory for individual diagnosis. The Wesley
Test in Social Terms differs from the Wesley Test in Political Terms
(1) in selection of items, and (2) in length. This test includes items from
the social studies instead of from one area alone. There are 80 items
for each form. The correlation of each of these tests with intelligence
tests, with reading and with tests of civics indicates that while these
tests are somewhat related to all of them they also measure something
different. Samples of terms measured are "smuggling," "sheriff,"
"public utility," "penalty," and "paternalism."

The use of the best-answer form of construction implies that all the ...
and the one is to be selected which most nearly ... "The form allows
the use of attractive ..., of partially correct ideas, and of a variety of
associations based upon similarities of sound and form."1 These two tests
were made originally for grade 12 but are also used
in college. The Kelty-Moore Test of Concepts in the Social
Studies is intended for y 
56 items for tryout in each grade 
the junior high school. These ite 
authorities on the teachi 
experimentally on 100 fourth- 
100 eighth-grade pupils. Е 
and divided into two for 


Con: 


te 
: e selec 
ial Sciences.? The terms for this test мрен E 
tds which had been collected ug: professors 0 
Sources and then evaluated by 64 high school teachers, 5 p t 


"uve to 
s tive 18 
ividuals specially trained to be sensi hu 


terms were arranged in Forms A, B, C, and D,
containing 85 items in Form A and 80 each for Forms B, C, and D.
These are not parallel forms.


16. Which word refers to the affairs relating to one's own country?
    (a) foreign (b) international (c) domestic (d) diplomatic
41. What happens when money depreciates?
    (a) it becomes less valuable (b) it will buy more (c) it has to go back to
    the mint (d) it can be used in foreign countries

The following two items are from Form B:

34. Which word refers to political corruption?
    (a) graft (b) lynching (c) revolt (d) mutiny
66. What is the outer edge of a civilized area called?
    (a) metropolis (b) suburbs (c) frontier (d) seacoast


wit? 
acd at 
е d 
eading and talking world they ^, 10" 


e Pye 
: ma te 
vocabularies have of 


p. 222 (an article by Edgar B. Wesley).
2 Published in Kelley and Krey, op. cit.


It is for this reason that these tests of terms and phrases are so
important. While some of the tests described are not as well standardized
as we should like, they represent a movement in the right direction.
Such tests should be supplemented by teacher-made tests of terms, so that
what is learned about social interaction may be clear and well
understood.
Measurement of Attitudes in the Social Sciences


In Chap. 17 appears a discussion of the formation and measurement
of attitudes. The present treatment presupposes what is there
presented and offers a sample of attempts to measure some of those
attitudes which are supposed to grow directly out of courses in social
science. Since attitudes are so instrumental in determining the action
which is taken, they need great clarity in definition and precise
instruments to measure their attainment. Unfortunately, neither of these
outcomes has been satisfactorily achieved.

The usual attitude test or scale consists of a series of statements with
which the subject may express agreement or disagreement. Such a
scale is the Wrightstone Scale of Civic Beliefs,1 which is suitable for
grades 9 to 12. This scale is divided into four parts:
I. Racial attitudes
II. International attitudes
III. National political attitudes
IV. Attitudes toward national achievements and ideals


Und 
State, A. If you disagree, make 
Ina, ts and use а question 
tations are selected from each part. 
A D 


The white race is no better nor worse than other races.              A D
The United States should prohibit Chinese immigration.               A D

The next two are from Part II:

Most of our immigrants are undesirables from other nations.          A D
The United States should pursue a liberal policy towards
immigration.                                                         A D

The next two are from Part III:

... a traitor refuses to fight for ... country.                      A D
Business and industry increasingly ...

1 World Book Company, Yonkers, N.Y. Items used by permission.




Finally, here are two from Part IV:

65. Most criminals tend to be feebleminded and ignorant.             A D
75. Only radicals and socialists join labor unions.                  A D

The scale is easily scored and furnishes percentile norms for grades
9, 10, 11, and 12. The percentiles thus obtained indicate the degree of
liberalism or conservatism which an individual possesses. For example,
if a subject receives a percentile score of 75, this means that the
individual is more liberal than 75 per cent of the standardized group and
less liberal than 25 per cent.
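
The reading of a percentile score just described can be sketched in Python. This is an illustration under stated assumptions, not the scoring procedure of the Wrightstone manual: it assumes that a higher raw score means a more liberal response pattern, and it counts only the members of the norm group who score strictly below the subject. The function name `percentile_rank` and the norm group of 100 scores are hypothetical.

```python
from bisect import bisect_left


def percentile_rank(score, norm_scores):
    """Per cent of the norm group scoring strictly below the given score."""
    ordered = sorted(norm_scores)
    if not ordered:
        raise ValueError("the norm group is empty")
    # bisect_left returns how many norm scores are smaller than `score`.
    below = bisect_left(ordered, score)
    return 100.0 * below / len(ordered)


# Hypothetical norm group: raw scores 1 through 100, one pupil at each.
norm_group = list(range(1, 101))
rank = percentile_rank(76, norm_group)
```

With this invented norm group a raw score of 76 exceeds 75 of the 100 scores, so the subject stands at the 75th percentile. Other conventions, such as crediting half of the tied scores, give slightly different figures.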


in 
: : : : e resent an i ] 
The test was validated by using only items which wer р 


21. 
common use in textbooks. The items used were checked by 
scientists as to whether th 


as being liberal or cons 


too limited experiences 


; and they may prevaricate. Each O 
discussed in Chap. 17, 


SUMMARY 


Two general &pproaches to the problems invo 
Science 


ja 

А 500} 2 

Ived in teachi ден 

have caused the measuring instruments to be ppm g Le, 


s c 
9 presented by means of which attitudes 
registered. 


In general, it was found that the objectives in teaching social science
are very numerous and that many of them have not as yet been
satisfactorily measured. Among these latter are interests, social participation
both in school affairs and in after-school life, and attitudes.
Objective tests of a student's ability to marshal his information about
a given topic and arrange it in a convincing manner have not as yet been
constructed. Despite the tendency of many tests of the special subjects
to emphasize the acquisition of information as such, it still may be
truthfully averred that there are many useful standardized tests in the
social sciences.


LIST OF TESTS IN SOCIAL SCIENCE 


I. History

American History

1. Cooperative American History
Test, high school. 1933-1940. Forms
X and Y. Time: 40 minutes. Authors:
Howard R. Anderson, E. F. Lindquist,
Charlotte W. Croon, and Harry Berg.
Cooperative Test Service, New York.

2. Kansas American History Test,
high school and college. Two forms. Two
levels. Time: 40 minutes. Authors:
... Hartung, H. E. Schrammel, and
... Bureau of Educational
Measurements, Kansas State Teachers
College, Emporia, Kans.

3. Coordinated Scales of Attainment
in American History, grades 1-8. 1932-
1933. ... form. Time: 45 minutes.
Authors: Mary G. Kelty and M. J. Van
Wagenen. Educational Test Bureau,
Minneapolis, Minn.

4. American History Test, National
Achievement Test, grades 7-8. 1937-...
Two forms. Nontimed (about 50
minutes). Authors: Robert K. Speer,
Lester D. Crow, and Samuel Smith.
... Publishing Co., Rockville Center,
N.Y.

5. ... of Factual Relations in
American History, grades 10-12. 1936.
Two forms. Nontimed (about 100 minutes).
Author: Eugene S. Farley. Educational
Test Bureau, Minneapolis, Minn.

6. ... American History Test, high
school. Reliability: .87-.91. Factual
information, 28 items; skills, 16
items; interpretation of historical
information, ... items; understanding of
historical processes, 26 items; ...
12 items. World Book Company,
Yonkers, N.Y.


World History 


1. Cooperative World History Test, 
high school. 1934-1937. Forms X and Y. 
Time: 90 minutes. Authors: H. R. 
Anderson and E. F. Lindquist. Coopera- 
tive Test Service, New York. 

2. Taylor-Schrammel World History 
Test, high school. 1936. Test I, first 
semester; Test II, second semester. 
Time: 40 minutes. Authors: Wallace 
Taylor and H. E. Schrammel. Bureau of 
Educational Measurements, Kansas 
State Teachers College, Emporia, Kans. 

3. Iowa Academic Contest, Every-
pupil Tests, high school. New forms each
year. World history. Bureau of Educa- 
tional Research and Service, University 
of Iowa, Iowa City. 

4. Cooperative Contemporary Affairs 
Test of High School Classes. 1940. One 
form for each year. Time: 120 minutes. 
Authors (1940 edition): Alvin C. Eurich, 
Elmo C. Wilson, Edward A. Krug; et al. 
Cooperative Test Service, New York. 

5. Iowa Academic Contest, Every-
pupil Tests, High School Contemporary
Affairs. Bureau of Educational Research
and Service, University of Iowa, Iowa
City.


European History 

1. Cooperative Modern European
History, high school and college. 1937-
1940. Forms N, O, P, and Q. Time: 40
minutes. Authors: H. R. Anderson,
Wallace Taylor, E. F. Lindquist,
Charlotte W. Croon, and Mary Willis.
Cooperative Test Service, New York.

2. Kansas Modern European History,
Test II, high school. 1938. One
form. Time: 40 minutes. Authors:
Alvin L. Hasenbank and H. E. Schrammel.
Kansas State Teachers College,
Emporia, Kans.

3. American Council European History,
grades 10-15. 1929. Two forms.
Time: 90 minutes. Authors: Harry J.
Carman, Walter C. Langsam, and Ben
D. Wood. World Book Company,
Yonkers, N.Y.

4. Vannest Diagnostic Test in Modern 
European History, high school. Bureau 
of Cooperative Research, Indiana 
University. 


Ancient History 


1. Cooperative Test in Ancient History,
high school. 1938-1939. Forms
O and P. Time: 40 minutes. Authors:
Howard R. Anderson, E. F. Lindquist,
Wallace Taylor, and Charlotte W.
Croon, et al. Cooperative Test Service,
New York.


IL Civics AND GOVERNMENT 


1. Cooperative Test 
Government, high school. 
Y. Time: 40 


in American 


tive. Coopera- 
York. 

Community 
4. Key made 
unity. Time; 30 
Ray A. Price and 
n. Cooperative Test 


Council Civics and 
Government Test, high scl 


hool and col- 
lege. 1929. Two forms. Time: 90 min- 
utes. Authors: Rober D, 


х Leigh, Joseph 
D. McGoldrick, Peter H. 


Odegard, and 
Ben D. Wood. Reliability: gg. World 


Book Company, Yonkers, N.Y. 

4. Iowa Academic Contest, Every- 
pupil Tests, American Government, 
high school. Bureau of Educational Re- 
search and Service, University of Iowa, 
Iowa City. 

5. Mordy-Schrammel Elementary 
Civics Test, elementary grades and 



z state 
junior high school. Kansas 
Teachers College, Emporia, 

6. Hill Test in Civic дашы 
6-12, Public School Publish! 
pany, Bloomington, I. 

7. Hill Test in Civic em 
grades 6-12. Public ee 
Company, Bloomington, In. үс Actio 

8. Hill-Wilson Test in CN publishing 
grades 6-12. Public Sp 
Company, Bloomington, Ill. 


mation: 
plishin8 


III. Economics 


1. Cooperative Economic 
School “г college. 1939. ep oe 
Time: 40 minutes. Authors: q 
Anderson, J. E. Partington, ies 
erative Test Service, New YO ata TOY 

2. American Council uod Воо 
high school and college. V 
Company, Yonkers, N.Y. “ 

3. Iowa Acadeniic et Pet r 
pupil Tests, high school Саг on 
Bureau of Education Ret owe Сі 
Ѕегуісе, University of Iowa; 


IV. SocroLocY ў 
1. Black-Schrammel Soci res 525 
high-school and college: Kant, 
Educational Measurement i, s 
State Teachers College, EMP 


Өй 

st, hig? 
s p and $ 
ard Р“ 
Coop" 


GRAPHY 
V. бео Geog" 


1. Wiedefeld-Walther pour for 
Test, grades 4-8. 1931. N M 
Time: 60 minutes. Authors P 
Wiedefeld and Е. Curt Walt yy, 
Book Company, Yonkers, ractic? 
2. Brueckner-Cutright raph Y, oo 
ercises in Locational Geofrey sch, 
mentary grades and junior inne 
Educational Test Bureau, 
Minn, 


E pe 
VI. Ѕостат SCIENC" ig th 


iencY "o 

1. Test of General eee se E 
Field of Social e jev” 
Cooperative General 
Tests, revised series. Form gary 
Time: 40 minutes. Authors service 
et al. Cooperative Test 
York. 




au eni Test of Social Studies 
0. Time. high school. 1916-1939. Form 
Wü ү, 80 minutes. Authors: J. Wayne 
Se ghtstone ef al. Cooperative Test 
| EE New York. 
Se of Critical Thinking in the 
Tuo T tudies, grades 4-6. 1938-1939. 
n | эи. Time; 45 minutes. Author: 
A Wrightstone. Bureau of Pub- 
Univ ns, Teachers College, Columbia 
4 ersity, New York. 
E Social Studies Unit 
ad grades 4, 6, and 8. Kansas State 
5 a College, Emporia, Kans. 
the S elty-Moore Tests of Concepts in 
ocial Studies, grades 4-9. Authors: 




M. G. Kelty and N. E. Moore. Charles 
Scribner’s Sons, New York. 

6. Wesley Test in Social Terms, grades 
6-16. 1932. Two forms. Nontimed 
(about 30 minutes). Author: Edgar B. 
Wesley. Charles Scribner's Sons, New 
York. 

7. Wesley Test in Political Terms, 
high school. Charles Scribner’s Sons, 
New York. 

8. Kepner Background Test of Social 
Studies in High School. Ginn & Com- 
pany, Boston. 

9. Pressey Tests of Concepts Used in 
the Social Studies, high school, 1934. 
Charles Scribner’s Sons, New York. 


QUESTIONS AND EXERCISES


1. Distinguish sharply between the two points of view which give
direction to the teaching of social science.

2. State four types of objectives for the teaching of social science.

3. Distinguish between inert facts and functional facts.

4. Name and explain four objectives for which there are no satisfactory
objective tests.

5. Secure copies of the Metropolitan Achievement Tests and the
Coordinated Scales of Attainment. Make a careful comparison of the
(1) items, (2) sampling of historical facts, and (3) general value of
their two history tests.

6. How would you arrive at a judgment of children's interests in
social-science activities?

7. Critically evaluate the use of the multiple-choice type of question
in measuring the outcomes of history teaching.

8. Explain the meaning of a scaled score. What are its uses?

9. What are some of the outcomes tested by the Cooperative Social
Studies Test?

10. Do you think such a test as that of Wrightstone really tests
critical thinking? Why?

11. Describe Wrightstone's Attitude Scales. Enumerate its strong
points and its weak ones.

12. Which of the tests mentioned measures the capacity to read in the
social sciences?

BIBLIOGRAPHY

BUROS, OSCAR K. (ed.): The Nineteen Forty Mental Measurements Yearbook,
Items 1614-1642. Highland Park, N.J.: The Mental Measurements
Yearbook, 1941.

BUROS, OSCAR K. (ed.): The Third Mental Measurements Yearbook,
Items ... New Brunswick, N.J.: Rutgers University Press, 1949.

The Forty-fifth Yearbook of the National Society for the Study of
Education, Part I, "The Measurement of Understanding," Chap. V.
Chicago: University of Chicago Press, 1946.

GREENE, HARRY A., ALBERT N. JORGENSEN, and J. RAYMOND GERBERICH:
Measurement and Evaluation in the Secondary School, Chap. XVII. New
York: Longmans, Green & Co., Inc., 1943.

KELLEY, T. L., and KREY, A. C.: Tests and Measurements in the Social
Sciences, pp. 1-119, 153-233, 234-339. New York: Charles Scribner's
Sons, 1934.

SMITH, EUGENE R., RALPH TYLER, et al.: Appraising and Recording Student
Progress, Chap. III. New York: Harper & Brothers, 1942.

TOWNSEND, AGATHA: "The Reliability and Validity of the USAFI American
History Test," in 1947 Achievement Testing Program in Independent
Schools and Supplementary Studies, pp. 53-58, Educational Records
Bulletin No. 48. New York: Educational Records Bureau, 1947.

TRAXLER, ARTHUR E.: Techniques of Guidance, pp. 90-93. New York: Harper
& Brothers, 1945.

WESLEY, EDGAR BRUCE: Teaching the Social Studies, Chap. XXIII. Boston:
D. C. Heath and Company, 1937.



Articles

...: "How Do Senior College Students and Adult Groups Stand on ...
Test?" School and Society (19...) ...

LINDQUIST, E. F.: "The ... of the American History Examination of the
Cooperative Test Service," Educational Record (1931) 12:459-475.

PRICE, ROY A., and ROBERT ...MAN: Part 8, "Testing for ...
Information," pp. 213-225, "... of Community Resources in the Social
Studies," Ninth Yearbook of ... Society for Social Studies, ...

READ, JAMES MORGAN: "History versus the Social Sciences," School and
Society (1943) 58:149-151.

TRAXLER, ARTHUR E.: "Progressive Methods as Related to Knowledge of
American History," School and Society (1943) 57:640-643.


CHAPTER 8
Measurement of Foreign Languages


OBJECTIVES IN TEACHING

The objectives customarily sought in the teaching of any foreign
language may be classified under four heads:

1. A knowledge of the language itself. This involves the ability to
read, write, spell, and speak the language. The materials used for
teaching this language may vary from newspapers and magazines
written in this foreign tongue to selections from its classics. It involves
the mastery of vocabulary, verb forms, idioms, agreements among
words, inflections, and other minutiae which are needed for reading,
speaking, and understanding the language.

2. An appreciation of the literature written in that language. Even
in elementary courses some acquaintance is achieved with the
masterpieces which express realistically and artistically the great experiences
of humanity.

3. An appreciation of the geography, history, manners, customs, and
culture of the country whose language is being studied. Some
years ago Nicholas Murray Butler, then President of Columbia
University, spoke of teachers of the foreign languages as the ambassadors
who represented foreign countries and who helped students become
acquainted with the fine points of their civilizations. They were not to
think of themselves as teachers of a language only.

4. Interrelations between that language and English. English has
borrowed from many other languages. Thorndike's studies showed
that 52 per cent of ordinary ... words are derived from the Latin
and another ... per cent from ... through the Latin. Many
... have been adopted unchanged ... If the teacher keeps this objective
clearly in mind and ... to achieve it, considerable improvement in
knowledge of the derivation of English words and a better
understanding of English can be achieved.
The history of testing itself shows attempts to measure many of these objectives. The Columbia Research Bureau and the American Council on Education have constructed a variety of French tests in reading, grammar, and vocabulary. The Columbia Research Bureau has constructed an Aural French Test while Lundeberg and Tharp have an Audition Test in French. In the area of history, manners, and customs at least one test, Miller's French Life and Culture, has been constructed. Trabue, too, made a scale to aid in measuring French composition.


THE MORE MEASURABLE OBJECTIVES

As time went on and data accumulated on these tests of language achievement, it became increasingly clear that the facts involved in learning the language itself were more susceptible to accurate measurement than the other less well defined and less well [illegible]. At any rate, a careful study of the most successful tests at the present time indicates that they attempt to measure the following specific objectives:

1. Reading with understanding

2. Vocabulary growth

3. Knowledge of functional grammar

4. Translation into English and vice versa

Many teachers wish for a standardized test of conversation and pronunciation. [Illegible] functional and must be selected from those [illegible] proved to be the minimum essentials for understanding the language. And finally, the selections for reading must be long enough to develop rather thoroughly one idea and must be arranged in order of increasing difficulty.

1 Publications of the American and Canadian Committees on Modern Languages contain excellent research material (see Bibliography).


Because the Cooperative Test Series of the American Council on Education utilized to the best advantage principles based on research, they are generally regarded as the leading language tests today. A study of the 16 double-column pages of critical evaluation of French tests in [illegible] than any other [illegible] first place among educational [illegible] "a grammar test," and "a [illegible] traditional examination of yesterday." Not all statements are as flattering as are these, but the general trend is highly favorable.

The cooperative tests are issued each year so that new techniques and criticisms can be embodied in the latest forms. These yearly editions make it possible for the new test to embody changes that take place in the curriculum.

A good illustration of the cooperative test series appears in the Cooperative French Test,1 revised series, elementary, Form O. This test has three parts: reading (15 minutes), vocabulary (10 minutes), and grammar (15 minutes). Scores may be had for each of the parts and for the test as a whole.

The reading part has 40 items. Each item consists of a statement in French which is followed by five choices from which the correct answer is to be selected. The following illustrations are from Form O:

[Item number and opening of stem illegible] armée sont
-1 des officiers.
-2 des avocats.
-3 des militaires.
-4 des paysans.
-5 des invalides.

14. [Opening of stem illegible] l'édifice où l'on trouve beaucoup de livres
14-1 la cuisine.
14-2 la chambre à coucher.
14-3 la bibliothèque.
14-4 le pupitre.
14-5 le corridor.

The vocabulary part [illegible] for the correct answer in [illegible]. The words range in difficulty from chaud, tout, and [illegible] through pluie, bâtiment, and profond to papillon, lorsque, and honte. Each item is presented as follows:

1 Items of test by permission of Educational Testing Service, Princeton, N.J.


19. fois 
19-1 faith 
19-2 time 
19-3 hour 
19-4 sausage 
19-5 flower 

18. désespérer 
18-1 disturb 
18-2 descend 
18-3 despair 
18-4 deserve 
18-5 describe 


The grammar part has 35 items largely concerned with such matters as plurals, idioms, agreement of pronominal adjectives and nouns, indirect object, verbs that use être or avoir, and past participles. The answers are in French.


26. Are you cold?
(____) froid?
26-1 Avez-vous
26-2 Faites-vous
26-3 Étiez-vous
26-4 Faisiez-vous
26-5 Êtes-vous

28. He left immediately.
(____) parti immédiatement.
28-1 a
28-2 est
28-3 était
28-4 avait
28-5 faisait


[Illegible] public secondary [illegible] West and on (2) public secondary [illegible] available. The reliability of this test has been variously reported [illegible].
LIST OF FRENCH TESTS

I. GENERAL

1. Cooperative French Test. Elementary form, 1-3 semesters in high school; advanced form, 2 years high school [illegible]. Authors: elementary form, Jacob Greenberg and Geraldine Spaulding; advanced form, Geraldine Spaulding and [illegible] Vaillant. Time: 40 minutes. Cooperative Test Service, New York.

2. American Council [illegible], grades 9-16. Two parts. Two forms. Part I, vocabulary and grammar; Part II, silent reading and composition. Time: 40 minutes. World Book Company, Yonkers, N.Y.

3. American Council Beta French Test, grades 7-11. Two forms. Part I, [illegible]; Part II, comprehension; [illegible]. Time: 90-100 minutes. World Book Company, Yonkers, N.Y.

4. American Council French Grammar Test, grades 9-16. Two forms. Time: 22-27 minutes. World Book Company, Yonkers, N.Y.

5. American Council on Education [illegible] Test, 2 semesters or more of college French. Time: 50 minutes. World Book Company, Yonkers, N.Y.

6. Columbia Research Bureau French Test, grades 9-15. Time: 90 minutes. World Book Company, Yonkers, N.Y.

II. AURAL

1. Columbia Research Bureau Aural French Test, grades 9-16. Two forms. Time: [illegible]-60 minutes. World Book Company, Yonkers, N.Y.

2. Lundeberg-Tharp Audition Test in French, high school and college. Two forms. James B. Tharp, Ohio State University, Columbus, Ohio.

III. OTHER TESTS

1. French Life and Culture, high school and college. One form. Time: [illegible] minutes. Author: Minnie M. Miller. Bureau of Educational Measurements, Kansas State Teachers College.

2. French Reading, grade [illegible] 10. Two forms. Time: 30 minutes. Department of Educational Research, University of Toronto.

3. French Vocabulary Test, grades 
9-10. Two forms. Time: 30 minutes. 
Department of Educational Research, 
University of Toronto. 

4. Standard French Test, high school. 
Vocabulary, grammar, and comprehen- 
sion. One form. Time: Part I, 28 min- 
utes; Part II, 32 minutes. Public School 
Publishing Company, Bloomington, Ill. 

5. Cooperative French Test, lower 
and higher levels. Lower level, 1-2 years 
high school; higher level, more than 2 
years in high school. 1942-1947. Forms 
S and X. Time: 80-85 minutes. Authors: 
Geraldine Spaulding, Laura Towne, and 
Sarah Woolfson Lorge. Cooperative Test 
Service, New York. 

6. Examination in French Grammar, 
high school. Lower level, 1944, 1-2 years 
in high school, Form LFG-1-B-4; upper 
level, 1945, 234 years in high school, 
Form UFG-1-B-4. Time: 40-45 minutes. 
Authors: Examinations Staff of the U.S. 
Armed Forces Institute. Cooperative 
Test Service, New York. 

7. Examination in French Reading 
Comprehension, high school. Lower 
level, 1944, 1-2 years high school, Form 
LFR-1-B-4; upper level, 1945, 226 years 
high school, Form UFR-1-B-4. Time: 
50-55 minutes. Authors: Examinations 
Staff of the U.S. Armed Forces Institute. 
Cooperative Test Service, New York. 

8. Examination in French Vocabu- 
lary, high school. Lower level, 1944, 1-2 
years in high school, Form LFV-1-B-4; 
upper level, 1945, 234 years in high
school, Form UFV-1-B-4. Time: 40-45
minutes. Authors: Examinations Staff
of the U.S. Armed Forces Institute.
Cooperative Test Service, New York.


SPANISH TESTS

From the many tests of Spanish, the author has selected only those prepared by the Cooperative Test Service and by the Columbia Research Bureau. The Cooperative Spanish Tests are prepared after the manner of their French tests. The Cooperative Spanish Test, junior form, is divided into three parts:

Part                     Time, minutes
I. Reading ............. 15
II. Vocabulary ......... 10
III. Grammar ........... 15


The reading test consists of 40 sentences and short paragraphs which are answered in Spanish (junior form).1

39. A los discípulos que no son listos es difícil
39-1 castigarlos
39-2 enseñarles
39-3 aprenderlos
39-4 encontrarlos
39-5 mirarlos

22. Los hombres que viven mucho tiempo llegan a ser
22-1 conocidos
22-2 largos
22-3 verdes
22-4 ancianos
22-5 jóvenes

Part II contains 50 Spanish words ranging from easy to hard. The definitions are in English.


3. comprender 
3-1 understand 
3-2 buy 
3-3 eat 
3-4 take away 
3-5 promise 
22. triste 
22-1 road 
22-2 truthful 
22-3 trunk 
22-4 suit 
22-5 sad 
30. paso 
30-1 price 
30-2 paste 
30-3 part 
30-4 paving 
30-5 step

1 Items of test by permission of Educational Testing Service, Princeton, N.J.


36. ciego 
36-1 sky 
36-2 seal 
36-3 continuous 
36-4 blind 
36-5 wax 


Part III, on grammar, [illegible] a statement in English followed by translation into Spanish with the omission of a crucial word which illustrates the point [illegible].

16. It is half past six.
(____) las seis y media.
16-1 Es
16-2 Está
16-3 Son
16-4 Hay
16-5 Están

10. He has lost his books.
Ha perdido (____) libros.
10-1 suyos
10-2 suya
10-3 de él
10-4 su
10-5 sus

61. They have just opened it.
(____) de abrirlo.
61-1 Acaban
61-2 Hubieron
[remaining choices illegible]
The same strong points [illegible] present in the French test are present in the [illegible] test. The reliability is high ([illegible]). There are [illegible] forms of the test, and percentile norms are prepared [illegible] schools in the South, the Middle West, [illegible] and for independent secondary schools [illegible]. The Cooperative Spanish Test, revised series, advanced, [illegible] amount of time [illegible] each of the tests as does the elementary test. [Illegible] is more advanced.

LIST OF SPANISH TESTS

1. Cooperative Spanish Test, revised series, [illegible]. [Illegible] minutes; Part III, grammar, 15 minutes. Percentile norms for high school [illegible] students. Forms N, O, and [illegible]. Time: 40 minutes. Reliability: .95 (odds versus evens). Authors: Jacob Greenberg, Robert H. Williams, and Geraldine Spaulding. Cooperative Test Service, New York.

2. Cooperative Spanish Test, revised series, advanced form. Part I, reading, 15 minutes; Part II, vocabulary, 10 minutes; Part III, grammar, 15 minutes. Percentile norms for high school and college. Forms N, O, P, and Q. Time: 40 minutes. Reliability: .98 (odds versus evens). Authors: E. Herman Hespelt, Robert H. Williams, and Geraldine Spaulding. Cooperative Test Service, New York.

3. Columbia Research Bureau Spanish Test, high school and college. 1926-1927. Forms A and B. Part I, vocabulary, 25 minutes; Part II, comprehension, 20 minutes; Part III, grammar, 45 minutes. Time: 90 minutes. Reliability: .97. P.E. = 3. Authors: Frank Callcott and Ben D. Wood. World Book Company, [illegible].

4. Examination [illegible], lower level, 1[illegible] 1 year of college. 1944. Form B. Time: 40-45 minutes. [Illegible] Institute. Cooperative Test Service, New York.

5. Examination in Spanish Reading Comprehension, lower level, [illegible] of high school or 1 year [illegible]. Form B. Time: 40-45 minutes. [Illegible] separate answer sheets. Authors: Examinations Staff of the [illegible] Forces Institute. Cooperative Test Service, New York.

6. Examination in Spanish Vocabulary, lower level, 1-2 years of high school Spanish or 1 year in college. 1944. Form B. Time: 40-45 minutes. [Illegible] separate answer sheets. Authors: Examinations Staff of the [illegible] Forces [illegible]. Cooperative Test Service, New York.

7. Lundeberg-Tharp Audition Test in Spanish, high school and college. [Illegible]. Form B. Time: 30 minutes. Authors: Olav K. Lundeberg and James B. Tharp. James B. Tharp, College of Education, Ohio State University, Columbus, Ohio.

8. Iowa Placement Examinations, Spanish Training, Series ST, revised, grades 12-13. 1924-1926. Forms A and B. Time: 43(50) minutes. Authors: C. E. Seashore, G. M. Ruch, [illegible] Stoddard, [illegible] Vander Beke, and [illegible]. Bureau of Educational Research and Service, State University of Iowa, Iowa City, Iowa.


GERMAN TESTS

Tests of German are constructed in the same manner as those of French and Spanish.

The Cooperative German Test, revised series, elementary form, has also three parts:

Part                     Time, minutes
I. Reading ............. 15
II. Vocabulary ......... 10
III. Grammar ........... 15

The test of reading consists of 40 sentences, the answers to which are in German. The following illustrations are from Form [illegible]:1

1 Items of test by permission of Educational Testing Service, Princeton, N.J.

17. Um frische Luft ins Zimmer zu lassen, öffne ich
17-1 den Ofen
17-2 den Schrank
17-3 das Buch
17-4 den Mund
17-5 das Fenster

2. In der Klasse sehen wir die Schüler und
2-1 den Schneider
2-2 den Arzt
2-3 den Lehrer
2-4 den Kaufmann
2-5 den Fleischer

13. Unser Wohnzimmer ist
13-1 auf der Strasse
13-2 in dem Garten
13-3 in der Schule
13-4 in unserem Haus
13-5 im Hospital

12. Es ist zwölf Uhr mittags. Wir sollten jetzt
12-1 schlafen gehen
12-2 frühstücken
12-3 zu Abend essen
12-4 zu Bett gehen
12-5 zu Mittag essen

In Part II, on vocabulary, there are 50 words to be defined in English.

[Item number illegible] manchmal
-1 alternate
[illegible] sometimes
[remaining choices illegible]




24-4 forehead 
24-5 stimulant 
43. Sammlung 

43-1 sample 

43-2 similarity 
43-3 appliance 
43-4 collection 
43-5 foundling 


Part III, on grammar, contains 35 items. Each item has first an English sentence and then the German translation with a significant word omitted. This answer is found among five German words.
13. An hour has sixty minutes. 


( ) Stunde hat sechzig Minuten. 
13-1 Einem 


13-2 Eine 
13-3 Einer 
13-4 Ein 
13-5 Einen 

11. The beautiful lady is my aunt.
Die (____) [illegible] ist meine Tante.
11-1 schön
11-2 schöne
11-3 schönen
11-4 schöner
11-5 schönes

5. Now I speak only English.
Jetzt (____) ich nur Englisch.
5-1 spricht
5-2 sprecht
5-3 sprach
5-4 spreche
5-5 sprich


LIST OF GERMAN TESTS

1. Cooperative German Test, Elementary Form, grades 6-9, 1-6 semesters. Revised series. Forms N, O, and P. Part I, reading, 15 minutes; Part II, [illegible]; Part III, grammar, 15 minutes. Reliability: [illegible]. Cooperative Test Service, New York.

2. Cooperative German Test, Advanced Form, [illegible] 4 semesters or more. [Illegible]. Forms N, O, P, and Q. Time: [illegible] minutes. Reliability: .96. Cooperative Test Service, New York.

3. American Council Alpha German Test, grades 9-16. 1926-1927. Two forms. Two parts. Part I, vocabulary and grammar; Part II, silent reading and composition. Time: 40 minutes. World Book Company, Yonkers, N.Y.

4. Columbia Research Bureau German Test, [illegible]. 1926-1927. Two forms. Part I, vocabulary, 25 minutes; Part II, comprehension, 20 minutes; Part III, grammar, 45 minutes. World Book Company, Yonkers, N.Y.

5. American Council on Education German Reading Test, 2 semesters or more of college German. 1937-1938. World [illegible].

6. [Opening of entry illegible.] Time: 60-65 minutes. Authors: Examinations Staff of the U.S. Armed Forces Institute. Cooperative Test Service, New York.

7. Examination in German Reading Comprehension, lower level, high school and college, 1-2 years. 1945. Form B. Must use separate answer sheets. Time: 50-55 minutes. Authors: Examinations Staff of the U.S. Armed Forces Institute. Cooperative Test Service, New York.

8. Examination in German Vocabulary, lower level, high school and college, 1-2 years. 1945. Form B. Must use separate answer sheets. Time: 45-50 minutes. Authors: Examinations Staff of the U.S. Armed Forces Institute. Cooperative Test Service, New York.

9. Lundeberg-Tharp Audition Test in German, high school and college. 1929. Forms A and B. Authors: Olav K. Lundeberg and James B. Tharp. James B. Tharp, College of Education, Ohio State University, Columbus.

ITALIAN TESTS

Suitable tests for Italian have been constructed under the leadership of the Cooperative Test Service.

LATIN TESTS

The Latin tests of the Cooperative Achievement Tests also concentrate on reading, vocabulary, and grammar. The teaching objectives of Latin teachers are well measured in these tests. One Latin prognostic test is included which has great possibilities as an instrument of guidance.

The Cooperative Latin Test, revised series, elementary, Form Q has three parts:1

Part                     Time, minutes
I. Reading ............. 15
II. Vocabulary ......... [illegible]
III. [illegible]

A total score may also be computed. In the reading test there are two kinds of items. In the first 11 items a sentence is written in Latin with an essential word or phrase omitted. The omitted word or phrase which is correct appears among four other words or phrases which are incorrect. Here are some illustrations from elementary Form [illegible], experimental:

1 Items of test by permission of Educational Testing Service, Princeton, N.J.

10. Servus bonus in agris (____)
10-1 armabit
10-2 laborabit
10-3 timebit
10-4 movebit
10-5 portābit

13. Quid in bello timetis? (____) timēmus.
13-1 periculum
13-2 agrum
13-3 puerōs
13-4 flūmen
13-5 oculōs


The remainder of this part consists of three paragraphs in Latin [illegible] questions in English. Three questions are asked about [illegible] paragraph. [Illegible] 30 Latin words to be defined in English [illegible] Form [illegible]:

8. trēs
8-1 three
8-2 tree
8-3 very
8-4 effort
8-5 sad

27. infero
27-1 flee
27-2 yield
27-3 interfere
27-4 compare
27-5 bring into
42. jam 
42-1 since 
42-2 for 
42-3 already 
42-4 though 
42-5 before 


Part III, on [illegible] a sentence in English, its [illegible] choices among which [illegible] dative cases, and so on are included. Two illustrations are taken from elementary Form P:

7. I gave the queen a horse.
(____) dedi.
7-1 Rēgīnam equum
7-2 Rēgīnae equum
7-3 Rēgīnae equō
7-4 Rēgīnam equō
7-5 Rēgīna equum

19. We were in the camp.
(____) erāmus.
19-1 In castra
19-2 In castram
19-3 In castris
19-4 Castris
19-5 In castras


The advanced form of the Cooperative Latin Test has more complex sentences. The paragraphs to be read, the words to be defined, and the grammar are distinctly more difficult than those of the elementary form.

Percentile norms are available for these tests both for high schools and colleges. As for the other Cooperative Achievement Tests, percentile norms are furnished for public secondary schools and for independent secondary schools. The reliability of these tests is reported from .94 to .96.
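
The "odds versus evens" reliabilities quoted for the Cooperative tests in this chapter can be illustrated with a small computation. What follows is only a sketch under stated assumptions: the item data are invented, the function names are hypothetical, and the Spearman-Brown step-up is the customary way of converting a half-test correlation into a full-test estimate, not a procedure the text confirms was used for the published coefficients.

```python
from statistics import mean


def pearson(x, y):
    """Product-moment correlation between two equal-length score lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman_brown(r_half):
    """Step a half-test correlation up to an estimate for the whole test."""
    return 2 * r_half / (1 + r_half)


def odd_even_reliability(item_scores):
    """Odds-versus-evens reliability.

    item_scores holds one row per pupil; each row lists 1 (right) or
    0 (wrong) for the items in the order they appear on the test.
    """
    odd_totals = [sum(row[0::2]) for row in item_scores]   # items 1, 3, 5, ...
    even_totals = [sum(row[1::2]) for row in item_scores]  # items 2, 4, 6, ...
    return spearman_brown(pearson(odd_totals, even_totals))


# Invented answers of four pupils to a four-item test (illustration only).
pupils = [
    [1, 1, 1, 1],
    [1, 0, 1, 0],
    [0, 0, 0, 0],
    [1, 1, 0, 1],
]
print(round(odd_even_reliability(pupils), 3))
```

With only four items the estimate is naturally unstable; the published figures rest on tests of 35 to 50 items per part and on large groups of pupils.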

Because many students find Latin so difficult and make such little progress in mastering the language, it is often a moot question as to whether some students should take it. Two measuring instruments can be of aid here. The first of these, any good intelligence test, has already been discussed. Such a test correlates markedly with achievement in the course. The second measuring instrument is called a prognostic test. The [illegible] Prognostic Test presents a controlled situation [illegible] lessons are learned and [illegible] in a defined amount of time [illegible] be an earnest of future success. [Illegible] this miniature [illegible] with subsequent achievement as measured by a [illegible] achievement tests was reported by [illegible] intelligence and low on [illegible].

Other prognostic tests have been constructed for foreign languages: The Foreign Language Prognosis Test by Percival M. Symonds and the Luria-Orleans Modern Language Prognosis Test. The former of these, suitable for use in grades 8 or 9, has two forms and correlates .60 and .61 with achievement-test scores. Its working time is [illegible] minutes. The Modern Language Prognosis Test by Max A. Luria and Jacob S. Orleans claims to measure the ability of [illegible] Spanish, French, or even Italian. It can be used from grade [illegible] to grade [illegible]. The test requires 76 minutes to take. The correlation of .68 between prognostic-test scores and scores on achievement has not been found by other investigators. Kaulfers,1 for example, found correlations ranging from .35 to .52 between prognostic-test scores and achievement-test scores or teachers' marks.

It would thus appear that some prognostic tests of modern foreign language have not proved to be very effective in predicting subsequent standings in the language in question. One must remember that with a correlation of .60 a test's forecasting efficiency is only 20 per cent better than chance. Prognostic tests, however, can be used along with other factors as confirmatory evidence for or against taking the study of a foreign language.
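
The figure of 20 per cent can be checked with the index of forecasting efficiency, E = 1 - sqrt(1 - r^2), which expresses how much a correlation r reduces the error of prediction below that of a sheer guess. The sketch below is an illustration only; the function name is hypothetical, and the coefficients passed to it are simply those quoted in this section.

```python
from math import sqrt


def forecasting_efficiency(r):
    """Proportionate reduction in the error of prediction for correlation r.

    sqrt(1 - r**2) is the coefficient of alienation; one minus that value
    is the improvement over prediction by chance.
    """
    return 1 - sqrt(1 - r ** 2)


# Correlations reported in the text for the prognostic tests.
for r in (0.35, 0.52, 0.60, 0.68):
    print(f"r = {r:.2f}  efficiency = {forecasting_efficiency(r):.1%}")
```

For r = .60 the result is .20, the value cited above; the lower correlations found by Kaulfers yield correspondingly smaller gains over chance.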


LIST OF LATIN TESTS

1. Cooperative Latin Test, elementary [illegible]. First 3 semesters [illegible] college. Forms [illegible]. Reading, 15 minutes; vocabulary, 10 minutes; grammar, 15 minutes; also total score. Percentile norms for high school and college. Reliability: [illegible]. Authors: George A. Land, [illegible]. Cooperative Test Service, New York.

2. Cooperative Latin Test, advanced form, revised series. High school and college. Forms P, Q, and R. Reading, 15 minutes; vocabulary, 10 minutes; grammar, 15 minutes; also total score. Percentile norms for high school and college. Reliability: .94. Correlation [illegible] and total score: [illegible]. [Illegible] John C. Kirtland. Cooperative Test Service, New York.

3. Orleans-Solomon Latin Prognosis Test, high school and college. 1926. Seven lessons in Latin, include knowledge of masculine and [illegible] cases, vocabulary, verbs, singular and plural, [illegible], English derivatives. Correlation of test [illegible] average of teachers' marks [illegible] tests: .80. Authors: Jacob S. Orleans and Michael Solomon. World Book Company, Yonkers, N.Y.

4. Cooperative Latin Test, [illegible] level, high school and first [illegible] college; higher level, [illegible] in high school. 1942. Forms [illegible]. Time: [illegible] minutes. Authors: Harold V. King and Geraldine Spaulding. [Illegible] scaled scores. Cooperative Test Service, New York.

5. Kansas First Year Latin Test, [illegible] school, first and second [illegible]. Two forms. Two levels. Test [illegible] A and B, first semester; Test 2, [illegible] C and D, second semester. Time: 40 minutes. Authors: [illegible] Lois Bellinger, and H. [illegible]. Bureau of Educational Measurements, Kansas State Teachers College, Emporia, Kans.


EVALUATION OF TESTS OF FOREIGN LANGUAGES

Many of the criticisms leveled at French and other foreign-language tests at an earlier date have been met. The selection of words for the vocabulary tests has been improved, errors of fact have been eliminated, and questions have been arranged in the order of difficulty. The criticism that New England and New York norms might not be suitable for the rest of the country has been met by constructing norms for public secondary schools of the South, of the East, West, and Middle West and for the independent secondary schools of New England. Present-day criticism revolves around (1) the test forms, (2) the content of our best tests, and (3) the omissions.

Present-day evaluation of foreign-language tests is concerned about the very form of the objective test itself. The critics hold that the process of recognition of the correct answer out of five alternatives is a mental process quite different from actual recall of a word in a translation situation. Moreover, in such definitions of words only one meaning is used, while the essence of language rests in the variety of meanings a word can convey according to the context. A quotation here from Henmon answers this objection: "The reply is that the recognition method gives more pupil response in the same length of time, that scoring is easier and more objective; and that while the absolute scores in the completion or recall method are considerably lower, the correlation between results of this with those attained by the recognition method are almost as high as the reliabilities of either technique."1 Evidence for this last statement is furnished in German vocabulary tests in which the reliabilities varied from .89 to .94 and the correlations between a completion test and a five-response recognition test varied from .81 to .87.
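
One way to see what "almost as high as the reliabilities" implies is the familiar correction for attenuation, which divides an observed correlation by the square root of the product of the two reliabilities. The sketch below is an assumption-laden illustration rather than Henmon's own calculation: the function name is hypothetical, and pairing the lowest correlation with the lowest reliability (and highest with highest) is done only for the sake of example, since the text reports ranges and not matched values.

```python
from math import sqrt


def corrected_for_attenuation(r_xy, r_xx, r_yy):
    """Estimate the correlation between two tests freed of chance error.

    r_xy is the observed correlation between the two tests;
    r_xx and r_yy are their respective reliabilities.
    """
    return r_xy / sqrt(r_xx * r_yy)


# Ranges quoted in the text; the pairing of values is assumed.
low = corrected_for_attenuation(0.81, 0.89, 0.89)
high = corrected_for_attenuation(0.87, 0.94, 0.94)
print(f"corrected correlations: {low:.2f} and {high:.2f}")
```

Under these assumed pairings the corrected coefficients fall in the low .90s, which suggests that recognition and completion items rank pupils in much the same order even though, as the critics note, the two mental processes need not be identical.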
Foreign-language tests are subject to other shortcomings. It is claimed, for example, that the selections for translation are entirely too short and not unified. Critics are also fearful that the presentation of four wrong answers may affect the students' learning, for they should hear and see only correct forms. Other critics are sure that verb forms are inadequately sampled, or that pronouns get only a cavalier treatment. They are fearful, too, that since the tests have to do with the learning of the structure and meaning of the written language, teaching will be strongly influenced in the same direction.

A second group of evaluators are not satisfied with the content of the tests. They deplore the omission of tests of conversation and pronunciation. They think, too, that certainly something should be done about testing the transfer effect to the vernacular from the foreign language.

1 Henmon, V. A. C., Achievement Tests in the Modern Foreign Languages, Publications of the American and Canadian Committees on Modern Languages, Vol. V, p. 19, New York: The Macmillan Company, 1929.

SUMMARY

[Illegible]: (1) ability to [illegible] write, speak, and [illegible], (2) ability to appreciate its literature, [illegible]. [Illegible] research and have constructed standardized tests in the foreign-language [illegible]. Cooperative tests in [illegible] have been presented and illustrated. [Illegible] tests had high [illegible], separate tests for [illegible] advanced students, and [illegible] for both high school and college. These tests even go so far as [illegible] norms for different types of [illegible] different areas of the United States.

In spite of these excellencies many thoughtful teachers think [illegible] the form in which the test is constructed tests only the [illegible] recognize the right answer, a mental process very different [illegible] translation. Some of them think that the presentation to students [illegible]; while others emphasize the importance of aural tests.

It was pointed out that many of the desirable objectives [illegible] teaching of foreign languages have had as yet no satisfactory standardized tests constructed.


QUESTIONS AND EXERCISES 


1. Describe the four objectives usually sought in the teaching of any foreign language. Which one of these objectives is most susceptible to measurement?

2. What did President Butler imply by calling teachers of foreign languages "ambassadors"?

3. What features are usually included in a good French test? How reliable is it? How valid?

4. What sources of information of a research nature are available for test constructors in French?

5. Is the selection of the meaning of a word from five alternatives the same as translating it? What was the evidence offered by Professor Henmon bearing on this point? Do you think that Henmon's evidence answered the question?

6. If a person can translate a short 
passage well, can he also translate a long 
passage well? 

7. Why is it difficult to construct a 
satisfactory aural test? One of life and 
culture? 

8. Compare the French tests with 
the German and Spanish ones. Are there 
any differences in test construction? 

9. What are the salient characteristics of a prognostic test? Describe one such test.

10. What are the means available for 
advising a student about taking Latin? 


BIBLIOGRAPHY

BUROS, OSCAR KRISEN (ed.): The Nineteen Forty Mental Measurements Yearbook, Items 1340-1375. Highland Park, N.J.: The Mental Measurements Yearbook, 1941.

BUROS, OSCAR KRISEN (ed.): The Third Mental Measurements Yearbook, Items 178-213. New Brunswick, N.J.: Rutgers University Press, 1949.

GREENE, HARRY A., ALBERT N. JORGENSEN, and J. RAYMOND GERBERICH: Measurement and Evaluation in the Secondary School, Chap. XVI. New York: Longmans, Green & Co., 1943.

Handbook of The Cooperative Achievement Tests. New York: Cooperative Test Service.

HAWKES, HERBERT E., E. F. LINDQUIST, and C. R. MANN (eds.): The Construction and Use of Achievement Examinations, Chap. VI. Boston: Houghton Mifflin Company, 1936.

ODELL, C. W.: Educational Measurement in High School, Chap. VI. New York: Appleton-Century-Crofts, Inc., 1930.

PETERS, EXMA: "Relation of Tests to Improvement of Instruction," Classical Journal (1932) 28:187-196.

Publications of the American and Canadian Committees on Modern Languages. New York: The Macmillan Company, 1929.
BUCHANAN, MILTON A.: A Graded Spanish Word Book, Vol. III.
CHEYDLEUR, F. D.: French Idiom List, Vol. XVI.
HAUCH, EDWARD F.: German Idiom List, Vol. X.
HENMON, V. A. C.: Achievement Tests in the Modern Foreign Languages, Vol. V.
KENISTON, HAYWARD: Spanish Idiom List, Vol. XI.
MORGAN, B. Q.: German Frequency Word Book, Vol. IX.
VANDER BEKE, GEORGE E.: French Word Book, Vol. XV.

RUCH, G. M., and GEORGE D. STODDARD: Tests and Measurements in High School Instruction, Chap. VIII. Yonkers, N.Y.: World Book Company, 1927.

SEIBERT, LOUISE C., and EUNICE B. GODDARD: "The Use of Achievement Tests in Sectioning Students," Modern Language Journal (1934) 18:289-298.

SYMONDS, P. M.: Measurement in Secondary Education, Chap. VIII. New York: The Macmillan Company, 1927.

SYMONDS, P. M.: "A Foreign Language Prognostic Test," Teachers College Record (1930) 31:540-556.

TRAXLER, ARTHUR E.: Techniques of Guidance, pp. 81-84. New York: Harper & Brothers, 1945.

WRIGHTSTONE, J. WAYNE: "[Illegible]ing Diverse Objectives and [illegible] in Latin," Classical Journal [illegible] 34:155-165.


CHAPTER 9

Measurement of Mathematics

IMPORTANCE OF MATHEMATICS IN OUR MODERN WORLD

At no time in the history of the world has the importance of quantity, timing, and precision been more clearly demonstrated and more fully recognized than during the Second World War and since that time. Mathematics is the indispensable tool of precision in measures involving quantity and time. The natural sciences owe most of their progress to the use of measurement. Their slogan has been "Unless a thing is measured its nature remains unknown." Dramatic applications of mathematics in recent years have occurred in the areas of the social sciences. The outstanding tool in the quantification of the social sciences has been statistics. Furthermore, even betting odds are now calculated with mathematical nicety. Mathematics, then, justifies its place in school as an introduction to science and scientific thinking as well as in the workaday activities of trade and commerce.
TESTS OF MATHEMATICS IN THE ELEMENTARY SCHOOL

OBJECTIVES IN TEACHING

The [illegible] objective in teaching arithmetic is to aid pupils to understand [illegible] the quantitative aspects of daily life. It [illegible] capacity to use our number system in making more precise measurements of all kinds, in innumerable transactions involving money and the interchange of goods, in the calculations of time and distance, in the construction of objects of all kinds, and in many other situations. To accomplish this broad aim more specific objectives are necessary:

1. To acquire an understanding of the vocabulary used in quantitative thinking. In addition to the language of quantity such symbols as equal, [illegible] minutes, and seconds must be learned. [Illegible] capacity to translate written descriptions of quantitative [illegible] into [illegible] computations with numbers.

2. To [illegible] to perform quickly and accurately the four fundamental operations of addition, subtraction, multiplication, and division with whole numbers, mixed numbers, common and decimal fractions, and denominate numbers.


226 PROBLEMS OF MEASUREMENT 


3. To gain a deeper and more precise understanding of business transactions involving such problems as interest on [illegible], commissions and profits, taxation, school finances, [illegible] translating general statements about them into ideas of quantity.

4. To acquire the ability to solve problems that are described in words or that arise in ordinary living. In some cases this involves collections of facts bearing on a problem, the analysis of the problem, the decision about the operation or operations to use, and the correct manipulation of the [illegible]

5. To learn to [illegible] arising in everyday living [illegible] precise. Among these [illegible] and saving and of [illegible]

[illegible] and 9. A committee first [illegible] at this level. These objectives [illegible] four categories corresponding to the [illegible] of the test: (1) mathematical skills, (2) mathematical facts, (3) [illegible] applications, and (4) appreciation [illegible] nature and value of mathematics [illegible]


SURVEY TESTS FOR USE IN THE ELEMENTARY SCHOOL


All general test batteries for the elementary school have had sections on both the fundamentals and the problems of arithmetic. [illegible] in arithmetic are arranged in an increasingly complex manner [illegible]. Here are two samples: (1) the Metropolitan Achievement Tests, and (2) the California Achievement Tests.

The Metropolitan Achievement Tests, intermediate battery (grades 4, 5, and 6), has one section on arithmetic fundamentals and one on arithmetic problems. The section on arithmetic fundamentals includes the addition, subtraction, multiplication, and division of whole numbers, common fractions, and decimal fractions. In each of these [illegible]


MEASUREMENT OF MATHEMATICS 227


introducing mixed numbers and by requiring the subtraction of fractions with different denominators. Decimal fractions increase in difficulty to such examples as .003)10156. A few of the 60 examples deal with percentage and a few with the addition and subtraction of denominate numbers. The section on problems in arithmetic deals with a variety of written problems, only a few of which have grown out of the actual experiences of children. A few samples of problems which might grow out of a child's experiences are (1) the calculation of the number of boxes that would be needed if a girl has 255 candles and puts five in a box; (2) the calculation of Sol's earning at 40 cents an hour if he works from 8:30 to 11:00 and from 2:30 to 3:30; and (3) the distance club members can walk between 8:30 and noon if they walk 2[fraction illegible] miles an hour. Sample problems are concerned with the computation of the monthly income if the total yearly income is known, of the average monthly cost of gas if you know what the total cost per year is, and of the number of feet of wire fencing needed if the dimensions of a field are known. There are 40 problems. If we check the Metropolitan Arithmetic Test against the aims and objectives in teaching arithmetic, we [illegible] of the objectives described. There is no separate section for the testing of the vocabulary and symbols used in arithmetic. The problems, too, are more influenced by adult needs than by the experiences of children. There is no special technique already worked out for purposes of diagnosis. A teacher, though, may gain an understanding of a child's weaknesses by an analysis of his paper.

[illegible] the California Achievement Tests, will be given. [illegible] fundamentals. It [illegible] science, and [illegible] and therefore can give a much [illegible] treatment of these fundamentals. It is divided into four levels: primary (grades 1, 2, 3, and 4), elementary (grades 4, 5, and 6), intermediate (grades 7, 8, and 9), advanced ([illegible] and college). For our purposes we shall describe the elementary and intermediate batteries. [illegible] (grades 4, 5, and 6) is made up of seven sections. The first two have 30 items concerned with the [illegible] and symbols used in arithmetic. It asks what two hundred six [illegible], or "one thousand two" indicates in numbers [illegible] of four numbers. [illegible] about the meaning of ×, and +, [illegible] lb., [illegible]. Then follows a set of increasingly [illegible] problems which grow largely out of the experiences of children. Following these problems are one whole page of addition, one of subtraction, one of multiplication, and one of division. Except for the arrangement, which is very convenient




for studying each child's strong points and difficulties, the manipulations required differ very little from those of the Metropolitan Achievement Test. Altogether there are 105 items dealing with arithmetic while the Metropolitan has 100. The Metropolitan tests use 40 problems; the California tests, 15. The California tests include a plan already worked out and keyed for the analysis of difficulties. The intermediate battery resembles the elementary battery in form but differs in the following ways: the terms and written numbers are more difficult, for example, three-eighths, DCC, and "a 56 b 34 [illegible] find the largest number." The symbols to be known include the greatest common divisor as well as the formulas for measuring the volume of a prism and the area of a triangle. There are four pages of fundamentals as in the elementary battery. Opportunity also exists for analyzing errors by keyed references. This test does offer the teacher an opportunity for analyzing a child's results. The first two parts are tests of mathematical vocabulary and symbols. These two improvements make this a strong test for measuring arithmetic.


Other batteries which contain good tests of arithmetic are (1) the Stanford Achievement Test, and (2) the Coordinated Scales of Attainment.


SEPARATE TESTS FOR ARITHMETIC

More complete tests entirely devoted to mathematics are also available.

The Cooperative Mathematics Test for Grades 7, 8, and 9 attempts to measure the four objectives [illegible]. [illegible] contains addition, subtraction, multiplication, and division of whole numbers, mixed numbers, [illegible], mensuration, and [illegible]


hexagon, hypotenuse, meter, etc. In Part III mathematical applications 
are made to percentage of school children promoted, miles on a speed- 
ometer, cost of gas, table of contents of a book, percentage of bone in 
meat, thickness of ice and number of people allowed to skate, etc. 
Part IV deals with the recognition of facts missing from a problem.¹

2. A motorist used 10 gallons of gasoline on a trip. How many miles per gallon did he average? The fact not given which is needed to solve this problem is the
2-1 weather
2-2 date
2-3 time
2-4 miles covered
2-5 cost per gallon


Other items are (1) ability to read a graph, (2) size of fractions, (3) the facts needed to find the area of the front of a house, and (4) the conclusion that can be drawn from a bar diagram setting forth Federal expenditures for unemployment relief per year. Percentile norms are available for each part for grades 7, 8, and 9 based on the following:

Grade    N
7        1,564
8        2,241
9        3,773
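As a reading aid for the phrase "percentile norms," the following minimal sketch shows one common way a percentile rank is computed from a norm group. The midpoint convention (counting half of the tied scores) and the ten scores are assumptions chosen for illustration; the source does not say how the Cooperative norms were actually derived.

```python
def percentile_rank(score, norm_scores):
    """Per cent of the norm group scoring below `score`, counting half of the ties."""
    below = sum(1 for s in norm_scores if s < score)
    equal = sum(1 for s in norm_scores if s == score)
    return 100 * (below + 0.5 * equal) / len(norm_scores)

# Hypothetical raw scores for a tiny norm group (not data from the test).
norm_group = [12, 15, 15, 18, 20, 22, 25, 25, 27, 30]
rank_of_20 = percentile_rank(20, norm_group)
print(rank_of_20)
```

With norm groups of the size shown in the table, the same calculation would simply run over one to four thousand scores per grade.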


The reliability of the total test and of its various parts was computed from [illegible] children. The narrowness of the range of one grade reduces the reliabilities somewhat. This test has also more value for predicting future success in algebra since the correlation between it and the Cooperative Elementary Algebra Test at the end of a year's [illegible] is reported as [illegible].

[illegible] survey of accomplishment in arithmetic: (1) the Compass Survey Tests in Arithmetic, and (2) Iowa Every-pupil Test of Basic Skills, Test [illegible].

DIAGNOSTIC TESTS IN ARITHMETIC

Three tests lay claim to being diagnostic tests in arithmetic: (1) the Compass Diagnostic Tests in Arithmetic, (2) the Diagnostic Test for Fundamental Processes in Arithmetic, and (3) the California Arithmetic Tests. Of these, the Compass Diagnostic Tests in Arithmetic is by far the most comprehensive and complete in its coverage of arithmetic processes. It is probably the most efficient diagnostic [illegible]. This test is divided into 20 different [illegible] accompanying table. Each test has about five [illegible]

¹ Item by permission of Educational Testing Service, Princeton, N.J.




COMPASS DIAGNOSTIC TESTS IN ARITHMETIC

Grades        [illegible]  Tests  Contents
2-8           27           I      Addition of whole numbers
2-8           18           II     Subtraction of whole numbers
[illegible]   31           III    Multiplication of whole numbers
[illegible]   60           IV     Division of whole numbers
[illegible]   50           V      Addition of mixed numbers
[illegible]   40           VI     Subtraction of mixed numbers
5-8           30           VII    Multiplication of mixed numbers
5-8           40           VIII   Division of mixed numbers
5-8           45           IX     Addition, multiplication, and subtraction of decimals
6-8           40           X      Division [illegible]
6-8           25           XI     Addition and subtraction of denominate numbers
6-8           30           XII    Multiplication and division of denominate numbers
7-8           54           XIII   Mensuration
6-8           38           XIV    Basic facts of percentage
7-8           44           XV     Interest and business forms
4-8           25           XVI    Definitions, rules, and vocabulary of arithmetic
5-6           35           XVII   Problem analysis, elementary
7-8           35           XVIII  Problem analysis, advanced
5-6           20           XIX    General problem scale, elementary
7-8           20           XX     General problem scale, advanced


[illegible] completeness of the facts covered:

Part 1. 70 basic addition facts
Part 2. 66 higher decade addition facts
Part 3. 13 examples ranging from three to seven single digits [illegible] column addition
Part 4. 13 examples of more difficult column addition, [illegible] to four-place numbers
[illegible] 7 examples similar to [illegible]

[illegible] is perhaps an [illegible] treatment of arithmetic [illegible]. The problems used are the traditional ones. Finally, [illegible] be emphasized that such a diagnostic test locates the errors [illegible] not arrive at the cause of the difficulty. It merely shows [illegible] work is unsatisfactory.

It is just at this point of understanding the cause of errors that the Diagnostic Test for Fundamental Processes in Arithmetic by Buswell and John comes into the picture. This is an individual test [illegible]




administration a teacher sits down with a child and listens to him work aloud a carefully arranged set of examples. Its dominating purpose is to discover the reasons for the wrong habits. There are lists of types of errors which can easily be checked as the test proceeds. For example, in "addition" are listed:

1. Errors in combination
2. Counting
3. Added carried number last
4. Forgot to add carried number
5. Repeated work after partly done
6. Added carried number irregularly
7. Wrote number to be carried
8. Irregular procedure in column
9. Carried wrong number
10. Grouped two or more numbers
and eighteen other errors.

This test was the first to measure the thought patterns of children. However, it does not distinguish clearly between errors of computation and faulty work habits. The samples, too, are at times too few for real diagnosis. Some users have felt that the check list of errors is far from complete. This difficulty could be met by the teacher's writing down an account of the occasional error which did not occur in the list.
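The checking procedure just described, including the suggested remedy of writing down errors that are not on the list, can be sketched as a simple tally. This is only an illustrative sketch: the ten categories are the ones quoted above, while the function name and the sample observations are hypothetical and are not part of the Buswell-John materials.

```python
from collections import Counter

# The first ten addition errors quoted from the check list above.
ADDITION_CHECKLIST = [
    "Errors in combination",
    "Counting",
    "Added carried number last",
    "Forgot to add carried number",
    "Repeated work after partly done",
    "Added carried number irregularly",
    "Wrote number to be carried",
    "Irregular procedure in column",
    "Carried wrong number",
    "Grouped two or more numbers",
]

def tally_errors(observations):
    """Count listed errors; keep anything not on the list as a written note."""
    listed = Counter()
    write_ins = []
    for note in observations:
        if note in ADDITION_CHECKLIST:
            listed[note] += 1
        else:
            write_ins.append(note)
    return listed, write_ins

# Hypothetical notes made while one child worked aloud.
notes = [
    "Counting",
    "Forgot to add carried number",
    "Counting",
    "Reversed digits when writing the sum",  # hypothetical error not on the list
]
listed, write_ins = tally_errors(notes)
print(listed.most_common())
print(write_ins)
```

Keeping the write-ins separate from the tallies mirrors the text's point that the printed list is a starting point and not a complete inventory.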


The third sample, the California Arithmetic Tests, already described under survey batteries, claims that it is a [illegible] procedures by which errors can be identified for the [illegible] and summarized for the class as a whole. This [illegible] is divided into (1) reasoning and (2) fundamentals. [illegible] been corrected and the scores brought forward to the [illegible] graph may be made of the scores, of the grade [illegible] rank. If a child makes a poor record in [illegible], his difficulties are then studied and a diagnostic analysis made of his [illegible]. Both arithmetic reasoning and arithmetic fundamentals [illegible] into parts and each part keyed to the problem [illegible] illustrates it. For example, addition is analyzed into the following:

a. Simple combinations
b. Bridging
c. Carrying
d. Zeros
e. Column addition
f. Adding money
g. Adding numerators
and five other parts.
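The idea of keying each part to the problems that illustrate it, and then summarizing errors for the individual and for the class, can be sketched as follows. The item numbers, the one-item-per-part key, and the pupil names are hypothetical; the actual California keys are not reproduced in the text.

```python
from collections import defaultdict

# Hypothetical key: item number -> part of addition it illustrates.
ITEM_KEY = {
    1: "Simple combinations",
    2: "Bridging",
    3: "Carrying",
    4: "Zeros",
    5: "Column addition",
    6: "Adding money",
    7: "Adding numerators",
}

def missed_parts(wrong_items):
    """Parts of addition in which one pupil made at least one error."""
    return sorted({ITEM_KEY[item] for item in wrong_items})

def class_summary(wrong_by_pupil):
    """Number of pupils who erred in each part."""
    counts = defaultdict(int)
    for wrong_items in wrong_by_pupil.values():
        for part in missed_parts(wrong_items):
            counts[part] += 1
    return dict(counts)

# Hypothetical results: the items each pupil answered incorrectly.
wrong_by_pupil = {"Pupil A": [3, 7], "Pupil B": [7], "Pupil C": []}
pupil_a = missed_parts(wrong_by_pupil["Pupil A"])
summary = class_summary(wrong_by_pupil)
print(pupil_a)
print(summary)
```

A key with only one item per part also makes the weakness discussed next easy to see: a single miss is thin evidence for a diagnosis.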




It would be a distinct gain if a test could be used as a survey and as a diagnostic test at the same time. There is [illegible] that the samples in this test are too few for a genuine diagnosis. For example, there are only eight problems in long [illegible] scattered through three levels; errors in adding numerators [illegible] example; in adding fractions and decimals, on one example; in denominate numbers, on one example. The test is also weak in describing its [illegible] construction and perhaps in including such content as [illegible], the square [illegible] sign, and so on.


«ap the 
" ith © 

; rison W js- 
greatly in compa is one OF 


the claims of the tests m test 
the items of the usual н d aken 
as where little learning more i 
Tests will add something P^ pe 


р еп . 
mpass Diagnostic Tests will go Cy ifie 
than any of the above in getti i 


TESTS OF MATHEMATICS IN HIGH SCHOOL
Tests suitable for testing the objectives of [illegible] mathematics are described for [illegible] of prognostic tests. It is understood that the Cooperative Mathematics Test for Grades 7, 8, and 9 is also suitable for testing in the junior high school.

OBJECTIVES IN ALGEBRA TEACHING

The objectives in the teaching of algebra illustrate [illegible] theories of the teaching of mathematics. The [illegible] emphasize the understanding of the symbols used in algebra, the [illegible] of how to manipulate these symbols, the uses of the equation, and
¹ W. A. Brownell, The 1938 Mental Measurements Yearbook (Oscar K. Buros, ed.), Item 893, New Brunswick, N.J.: Rutgers University Press, 1938.




the understanding and use of graphs. The other group speak continuously of the process of generalization, of drawing inferences, and of exercising ingenuity in applying algebra to experiences which are now occurring or will be likely to occur.
A complete list of objectives must of necessity include the outcomes implied in the two theories just described. Breslich's list, for example, follows pretty closely the proponents of the first theory. According to him the major objectives are:

a. To understand the terminology of the algebra taught during the semester.
b. To perform the fundamental operations that have been taught.
c. To combine and decompose simple algebraic expressions.
d. To derive equations from problems.
e. To solve equations.
f. To understand formulas.
g. To evaluate formulas.
h. To solve formulas for a given letter.
i. To translate verbal statements into formulas.
j. To understand graphical representation.
k. To use graphical representation.

It is quite evident that these objectives should be supplemented by the following:

1. The ability to draw inferences from algebraic data.
2. The capacity to generalize from mathematical facts presented.
3. Ingenuity in applying mathematical techniques to practical problems.
4. The ability to synthesize and coordinate mathematical facts through [illegible] of thinking.
5. The capacity to select data and bring it to bear on the solution of the problem.
[illegible] problem to construct tests which measure adequately the latter objectives. If a test is suitable [illegible] the first set of objectives, then this fact [illegible]. Perhaps instead of describing a [illegible] "an algebra [illegible] describe [illegible] manipulative and mechanical aspects of [illegible] the subtitle. At any rate, [illegible] descriptive statement might appear to inform the user that the [illegible] of the whole area [illegible] algebra instruction was not contemplated.

¹ [illegible] in The Nineteen Forty Mental Measurements Yearbook (Oscar K. Buros, ed.), Item 1455. By permission. Highland Park, N.J.: The Mental Measurements Yearbook, [illegible].


If these recommendations were carried out, nine-tenths of the criticisms of these instruments would not be necessary.


ALGEBRA TESTS

Two algebra tests will [illegible] Bureau Algebra Tests, and [illegible] divided into two levels:

Test                          Time, minutes
I. First semester             80
II. Revised, first year       100

Test I is divided into Part I, Mechanics, and Part II, Problems. [illegible] consists of 17 equations and 3 [illegible]


x 
5.3 = 12 
11. 6x? — 13x46 = 9 
15. P= 24 39 


Y dollars pu 


iore 
On the next to the last ; 
the equations of the probl 


Test II is much like Те 


¹ Items by permission of World Book Company, Yonkers, N.Y.


mechanical and manipulative nature. [illegible] .68 to [illegible] between the test [illegible]

[illegible] Cooperative Algebra [illegible] constructed and later [illegible]. At present there are many forms. Two levels of difficulty are [illegible]: Elementary Algebra through Quadratics, the simpler one, and Quadratics and Beyond, the more complex. The form of the items of both tests is that of multiple choice. The examples and problems differ slightly from those of the usual algebra in both [illegible] and lettering. Elementary Algebra through Quadratics is divided into three parts. Part I contains 20 samples of [illegible] the collection of terms, [illegible] of parentheses, solution of [illegible], [illegible] numbers, [illegible] fractional terms, and treatment of exponents in multiplication [illegible]. [illegible] is to do is made clear in each [illegible]. [illegible] which illustrate both the materials [illegible] and the technique used:


5. The sum of −15c⁴ and −3c⁴ is
5-1 −18c⁴
5-2 −12c⁴
5-3 12
5-4 12c
5-5 −18c⁸

19. If the graph of the equation 3x [illegible] passes through the point (m, [illegible]), the value of m is
19-1 [illegible]
19-2 −7
19-3 [illegible]
19-4 7

Part II deals with the solution of 15 problems, [illegible] two of which are [illegible] equations, and [illegible] is called for. The type of problem used is shown [illegible]:¹

4. In a certain high school there are 200 more girls than boys. The total number of pupils in the school is 1876. How many boys are there?
4-1 638
[remaining answer choices illegible]

¹ Items by permission of Educational Testing Service, Princeton, N.J.

15. A dealer wishes to mix hazelnuts worth 50¢ a pound and cashews worth [illegible] a pound to obtain 10 lbs. of mixed nuts worth 55¢ a pound. How many pounds of cashews would he use?
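As an illustration of the one-unknown equation that item 4 above calls for, the following sketch sets up boys + (boys + 200) = 1876 and solves it. The function name is hypothetical, and nothing here is taken from the test's own scoring key.

```python
def boys_in_school(total_pupils, extra_girls):
    """Solve boys + (boys + extra_girls) = total_pupils for boys."""
    boys, remainder = divmod(total_pupils - extra_girls, 2)
    if remainder:
        raise ValueError("the figures given do not lead to a whole number of boys")
    return boys

boys = boys_in_school(1876, 200)
girls = boys + 200
print(boys, girls)
```

The check that boys and girls add back to the stated total is the kind of verification a pupil could carry out before marking an answer.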


Part III contains 28 items which involve algebraic manipulation of formulas and equations which, for the most part, involve symbols instead of numbers. Two samples are:

9. If [illegible], then s equals
[answer choices illegible]

14. If 5x − 7 = cx, then x equals
[answer choices illegible]


[illegible] claims that the test is too mechanical, [illegible] because it emphasizes too much the mechanics and manipulative aspects of algebra. The test is weak in its measurements of the [illegible] to draw inferences from data and of the ingenuity needed in applying algebraic techniques to [illegible]


1. Does a large majority of authorities agree as to the importance of 
the objectives? 

2. Are the teachers actually striving to achieve these objectives? 

3. Is the objective clear, specific, and unambiguous? 

4. Is the objective capable of immediate attainment? 

In regard to the Cooperative Algebra Test it might be said that the 
authorities agree generally on the importance of the items. From an 
inspection of the algebra used, there is no doubt that the teachers are 
striving to achieve them. The clarity and specificity of the problems 
are beyond question and are immediately attainable. 

It may be concluded, therefore, that the test measures well what it 
sets out to measure and hence is valid for that purpose. It does not 
emphasize those higher outcomes of thinking, inference, and application. 

Perhaps instead of calling it the Cooperative Algebra Test it could be
more properly called “the Cooperative Algebra Test of the Mechanical
and Manipulative Aspects of Algebra.”
The second criticism, possibly not so significant, is aimed at the form
in which the items of this test are cast—the multiple-choice form. 
Mathematicians object to the guessing involved. They say that it
does not test computational accuracy or the capacity to select and
synthesize data or to coordinate thought. The authors of the Cooperative
Algebra Test realized this difficulty. They actually introduced
answers which would mislead superficial inspection. These answers appear
plausible to those who are deficient in the very ability which is
being tested. It cannot be denied that this objection is to a certain
extent valid, but it must be balanced against the ease of scoring which 
is inherent in the multiple-choice form. 
* Review by Leroy N. Schnell in The Nineteen Forty Mental Measurements Yearbook, op. cit., Item 1467.

GEOMETRY TESTS

The same differences of emphasis characterize the objectives of geometry as was the case with algebra. One group of students would delete large areas of the usual textbook material in plane geometry. Its members would keep only those theorems which have practical [illegible] for those who will not use geometry technically. Applications of the rigor of proof exemplified in geometry they would have applied to the problems of the day until the student knew what good argument is like. Transfer of training in geometry under such a regime [illegible] could be expected [illegible]. The other group believes that the [illegible] problems [illegible] textbook are, after all, what can be [illegible]. They would have this as full of meaning as possible, but they would not delete great areas. This latter group would agree pretty [illegible] with the objectives developed by Lide in his book, Instruction in Mathematics:

1. Development of logical reasoning ability.
2. [illegible] of an appreciation of the utility and beauty of geometrical forms.
3. [illegible] of the student with the properties, mensuration, and relationships of common geometric forms.
4. Development of an understanding and an appreciation of deductive proofs.
5. Creation of an understanding of spatial concepts and relations.
6. Establishment of habits of precision and accuracy.
7. Development of an appreciation of the part geometry has played in the history of civilization.


Only a few tests are able to measure the rich variety of objectives set forth above. [illegible] measures as [illegible] geometric principles and facts. This test, developed for use in the United States Army, is suitable [illegible] the tenth grade and consists [illegible] circle, equality and similarity of triangles, the rhombus, and [illegible]. Fifteen minutes are used in administration. Sample [illegible] from Form Q:

¹ Lide, Edwin S., Instruction in Mathematics, National Survey of Secondary Education, Monograph No. 23 (U.S. Office of Education, Bulletin 1932, [illegible]), Washington, D.C.: Government Printing Office, 1933.
² Educational Testing Service, Princeton, N.J.
³ Items by permission of Educational Testing Service, Princeton, N.J.


[Figure: triangles ABC and DEF, not reproduced]

5. If triangle ABC is similar to triangle DEF, and if AC = 4, DF = 8 and EF = 5, then BC equals
5-1 1
5-2 2½
5-3 6[fraction illegible]
5-4 10
5-5 Solution impossible

20. Two parallel chords of a circle are each 16 inches in length; the distance between them is 12 inches. The radius of the circle is
20-1 10 inches
20-2 20 inches
20-3 6 inches
20-4 8 inches
20-5 5 inches
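The reasoning behind the two sample items above can be checked with a short sketch. It assumes the usual correspondence of vertices in item 5 (BC with EF, AC with DF) and, in item 5's notation and item 20, ordinary Euclidean geometry; the function names are hypothetical and the sketch is not part of the test materials.

```python
import math

def corresponding_side(other_side, own_known, other_known):
    """Side of the first triangle that corresponds to `other_side` of a similar triangle."""
    return other_side * own_known / other_known

def radius_from_equal_parallel_chords(chord_length, distance_between):
    """Equal chords lie at equal distances from the center, so each is half the gap away."""
    half_chord = chord_length / 2
    distance_from_center = distance_between / 2
    return math.hypot(half_chord, distance_from_center)

bc = corresponding_side(5, 4, 8)                     # BC = EF * AC / DF
radius = radius_from_equal_parallel_chords(16, 12)   # right triangle with legs 8 and 6
print(bc, radius)
```

Both items turn on a single principle (proportional sides, the Pythagorean relation), which is consistent with the text's remark that the part measures geometric principles and facts.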


Part III, which also requires 15 minutes for taking, consists of 15 [illegible] more complicated than those in Part II. They deal with circles, triangles, and parallelograms.

[Figure: two triangles, not reproduced]

Given angle A = angle D

Angle C can be proved equal to Angle F by proving from the given facts that the two triangles are
5-1 similar
5-2 congruent
5-3 equiangular
5-4 equilateral
5-5 equal in area


[Figure: not reproduced]

[illegible] little the nature of [illegible] situations it does test well [illegible] usual type of geometry instruction [illegible] false form used in this test is [illegible]. One must not forget, however, that multiple-choice answers can be machine-scored and in this manner inordinate [illegible] of time saved.

Many of the transfer values of geometry have been neither [illegible] upon nor clearly defined. Until they are both, measurement in [illegible] will be hindered. Some [illegible] steps in the proof [illegible] stifles ingenuity, [illegible] Because the [illegible]


length of time he intends to stay in school. In helping the child make such a decision the counselor needs all the information which can be collected. To help answer such a question there have been developed prognostic tests which help to foretell a child's standing in algebra or geometry. Because these tests prophesy only in a moderate degree anticipated scores or marks, they must of necessity be used only as added information. If the prognostic test, achievement tests in arithmetic, and intelligence test agree with school marks, then the prognosis is less likely to be incorrect.

Prognostic Tests in Algebra

Two types of prognostic tests have been constructed. One of these is more dependent upon what the pupil has learned; the other, on his capacity to learn material similar to what the course actually contains. The Lee Test of Algebraic Ability illustrates the first; the Orleans Algebra Prognosis, the second. Let us look more closely at the latter. This test by Orleans is divided into a test on arithmetic and into 12 other parts: (1) substitution in monomials, (2) use of exponents, (3) meaning of exponents, (4) substitution in monomials with exponents, (5) substitution in binomials with exponents, (6) like and unlike terms, (7) representation of relations, (8) representation of expressions, (9) positive and negative numbers, (10) problems, (11) addition of like terms, and (12) summary test. Each part except No. 12 contains both a lesson and a test thereon. Let us look at Lesson 3. There are seven illustrations of how to deal with exponents. Item 3 reads, "a² means a times a. If a = 3, then a² means 3² or 3 × 3, which equals 9." In the test you are advised that you may look back to Lesson 3 if you need to. Item 4 of Test 3 is "What does c² mean?" Lesson 9 has 9 items on how to use positive and negative numbers. Item 4 in this lesson says, "−12 followed by +3 means a loss of 12 followed by a gain of 3, which results in a net loss of 9. This is written −12 + 3 = −9." Item 4 in Test 9 is "−10 − 2." The fundamental question is how much algebra a student can learn in a defined amount of time (81 minutes).

The validity of this test has been determined by measuring the prophecy as obtained from the prognosis test against achievement 4[illegible] months later as measured by a standardized achievement test of algebra. In one case this correlation was .82; in another, .71. Since the test is now appreciably longer than when these computations were made, the authors believe the coefficient to be about .80 on the average. Do not forget that a correlation of .80 means an efficiency only 40 per cent better than chance.
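The closing figure of 40 per cent agrees with the index of forecasting efficiency, E = 1 − √(1 − r²), which is an assumption here: the text does not name the formula it relies on. Under that assumption, a minimal sketch of the arithmetic is:

```python
import math

def forecasting_efficiency(r):
    """Proportional reduction in the error of estimate, compared with guessing the mean."""
    return 1 - math.sqrt(1 - r ** 2)

# Correlations quoted in the text for the Orleans test.
for r in (0.82, 0.71, 0.80):
    print(f"r = {r:.2f}: efficiency = {forecasting_efficiency(r):.2f}")
```

For r = .80 the term under the root is .36, its root is .60, and the efficiency is .40, which matches the text's statement. The sketch also shows how quickly the index falls as the correlation drops.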

In geometry, similar prognostic tests have been constructed by Orleans, by Lee, and by the Iowa authors.


pat- 
: . e 
Tests of these arithmetic Processes are furnished in the survey ially 
teries which test 


] analysis of еасһ individual's strong а at al 
points. These batteries i 


el there аге available tests of ocn Д 
Plane and solid), and ігійопоте ру 


а " 
e uset 


s h of 
nature of the tests be furnished н o e 
emphasize the natur s 


LIST OF MATHEMATICAL TESTS

I. ARITHMETIC TESTS

Survey

1. The Cooperative Mathematics Test for Grades 7, 8, and 9. 1940. [illegible] forms. Time: 80 minutes. Reliability: .92. Authors: Alice H. Darnell, John C. Flanagan, Stevenson W. Fletcher, and Rose E. Lutz. Cooperative Test Service, New York.

2. Arithmetic [illegible] the following batteries: (a) Stanford Achievement Tests, (b) Metropolitan Achievement Tests, (c) California Achievement Tests, and (d) Coordinated Scales of Attainment.

3. Analytical Scales of Attainment in Arithmetic. 1933. [illegible] forms. Three levels. Time: 80 minutes. Authors: L. J. Brueckner, [illegible] Kellogg, and M. J. Van Wagenen. Educational Test Bureau, Minneapolis, Minn.

4. Compass Survey Tests in Arithmetic, grades 2-8. 1927. Two forms. Two levels. Grades 2-4, 25 minutes; grades 4-8, 35 minutes. Authors: H. A. Greene, F. B. Knight, G. M. Ruch, and J. W. Studebaker. Scott, Foresman & Company, Chicago.

5. Basic Arithmetic Skills, Iowa Every-pupil Tests of Basic Skills. New edition. 1940, 1945. Forms L, M, N, [illegible]. Two levels. Elementary battery, grades 3-5, 57 (65) minutes. Advanced battery, grades 5-9, 68 (80) minutes. [illegible] machine scored. Author: H. F. Spitzer aided by Ernest Horn, H. A. Greene, and E. F. Lindquist. Houghton Mifflin Company, Boston.

Diagnostic Tests

1. Compass Diagnostic Tests in Arithmetic, grades 2-8. 1925. One form. 20 [illegible]. Time for each part ranges from [illegible] to 54 minutes. Authors: G. M. Ruch, F. B. Knight, H. A. Greene, and J. W. Studebaker. Scott, Foresman & Company, Chicago.

2. Diagnostic Test for Fundamental Processes in Arithmetic, grades 2-8. [illegible]. An individual test. Two forms. Untimed (about 20 minutes). Authors: G. T. Buswell and Lenore John. Public School Publishing Company, Bloomington, Ill.

3. California Arithmetic Tests. 1933-1939. [illegible] forms. Three levels. Primary battery, grades [illegible], 50 minutes; elementary battery, grades 4-6, 60 minutes; intermediate battery, grades 7[illegible]; advanced battery, grades 9-14; [illegible] 68 minutes [illegible]. Authors: Ernest W. Tiegs and Willis W. Clark. California Test Bureau, Los Angeles, California.

4. Diagnostic Tests in Arithmetic Fundamentals, grades 2-6. 1945. One form. Five levels. Different material for each grade. Grade 2, addition and subtraction, 87 (110) minutes; grade 3, addition, subtraction, and multiplication, 73-95 minutes; grade 4, Part 1, addition and subtraction, 100 (120) minutes; grade 4, Part 2, multiplication and division, 90 (110) minutes; grade 5, Part 1, addition, subtraction, multiplication, and division, 80 (100) minutes; grade 5, Part 2, fractions (addition and subtraction), 90 (110) minutes; grade 6, Part 1, addition, subtraction, multiplication, division, 60 (75) minutes; grade 6, Part 2, fractions and decimals, 75 (95) minutes. Authors: Department of Educational Research, Ontario College of Education, University of Toronto.

5. Hundred Problem Arithmetic Test, grades 7-12. 1926-1944. Forms V and W. Time: 40 (45) minutes. Authors: Raleigh Schorling, John R. Clark, and Mary A. Potter. World Book Company, Yonkers, N.Y.


П. ALGEBRA TESTS 


1. Breslich Algebra Survey Test, high 
school. 1930-1931. First semester, 41 
minutes; second semester, 52 minutes. 
Author: E. R. Breslich. Public School 
Publishing Company, Bloomington, Ill. 

2. Columbia Research Bureau Algebra
Test, grades 9 or 13. 1927-1933. Two
forms. Two levels. Test 1, first semester,
grade 9 or 13. 80 minutes. Test 2, revised,
first year, grades 9-14, 100 minutes.
Authors: Arthur S. Otis and Ben D.
Wood. World Book Company, Yonkers,
N.Y.

3. Snader General Mathematics Test,
grade 9. 1951. Two forms, AM and BM.
Time: 40 minutes. Reliability: .80 and
.84. Norms for end of year based on
2,190 students in 22 states, C.A. 15-4,
I.Q. 98. Arithmetic, 42 per cent; informal
geometry, 23 per cent; graphic representation,
8 per cent; algebra, 25 per cent; numerical
trigonometry, 2 per cent. Evaluation and
Adjustment Series edited by Walter N. Durost.
World Book Company, Yonkers, N.Y.




4. Cooperative Algebra Test, Elemen- 
tary Algebra through Quadratics, re- 
vised series, high school. 1937-1943. 
Forms Q, R, S, and T. Machine scored 
but separate answer sheets need not be 
used. Time: 40 (45) minutes. Scaled 
Scores are provided. Authors: John A. 
Long, L. P. Siceloff, Leone E. Cheshire, 
Margaret P. Martin, and Marion F. 
Shaycoft. Cooperative Test Service, 
New York. 

5. Cooperative Intermediate Algebra 
Test, Quadratics and Beyond, revised 
series, high school. 1941-1943. Forms
R, S, and T. Time: 40 minutes. Authors:
John A. Long, L. P. Siceloff, Leone E.
Cheshire, and Marion F. Shaycoft.
Cooperative Test Service, New York.

6. Lankton First Year Algebra Test, 
high school. 1951. Forms AM and BM.
Time: 40 minutes. Reliability: .84 and
.87. Percentile norms based on 3,183
students from 22 states. Median C.A. of
students 15-1, median I.Q. 106. Simple
operations, formulas, equations, graphs, 
problem solving. Evaluation and Ad- 
justment Series edited by Walter N. 
Durost. World Book Company, Yon- 
kers, N.Y. 


7. Iowa Every-pupil Test in Ninth
Year Algebra, high school. New form
each May. Time: 55 minutes. Author:
Vernon Price. Bureau of Educational
Research and Service, State University
of Iowa, Iowa City.

III. ACHIEVEMENT TESTS IN PLANE AND SOLID GEOMETRY AND TRIGONOMETRY


Plane Geometry


TL. Cooperative Plane 


| Geometry Т t 
high school. 1933-1949 bis 


- Forms at pres- 
ime: 40 minutes. 


tence in 
1951. T 


100; grade 17 
Grade 9, С.А. 15-2, LQ, Mls CA. Й 
С.А. 16, 1.0. 102; p 10.16 
1.0. 103; grade 12, С.А. йет of d 
Consumer problems, pro» investme ы 
insurance, changing money pos val by 
bonds, banking, Bug edited E 
tion and Adjustment 9e Book Co 
Walter N. Du W or 
, Yonkers, N.Y. Achiet s 
P Orleans Plane Geometry at form 
ment Test. 1929. Two caper ond 
Test 1, for first REDIERE or se ү 
I and II except loci; Tes LIV p fa 
Senlester, COVENS Books rms ?. ilit 
and loci. Percentile hls еба? op? 
wp! ws qos 2 “authors: Teon 
+ a * 
eo J. S. Orleans. World 
pany, Yonkers, N.Y. Geomet?) 
4. Shaycoft Plane 3 АМ ап, per 
high school. 1951. Form lity? 8 
Time: 40 minutes. Relia 4 stud 110. 
centile norms based Jc І. 
24 states. Median Ce ivnthetit 
Adds analytic Mg У . 
and indirect proof ( 
attempt at pn crie 
ation and Adjustment rid po? 
Walter N. Durost. Wo Я 
pany, Yonkers, N.Y- p Вие", 
5. Columbia nk: 0 
Geometry Test. vl 
forms. Working time: as: 99: ^y 
ability between two ym pen Р NY 
Herbert E. Hawkes an ronker’ 
World Book Company etry 
6. Iowa Plane i igh 5€? 
Test, revised edition, ^ e 
1942. One form. Time: h а me 
Machine-scored, or af [m 
Swer sheets need not Ww. Brace cor 
Н. A. Greene and Н. ci and ci 
of Educational Red owe 
State University of low 
А 
ett д " 
id Сеот af ig 
1. Cooperative Solid Forms t v 
high school. 1932-1938. “© s Papi 
Scaled norms. Percent! S etr gno 
school classes in solid 8° 1% 
40 minutes. Authors: 


Solid Geometry 




John A. Long, and L. P. Siceloff.
Cooperative Test Service, New York.


Trigonometry 


"^ реше Trigonometry Test, 
© n grades 11-15. 1928-1930. Forms 
e ,and U. Time: 40 minutes. Scaled 
в рет scores provided for high 
i and college classes in trigonome- 
Sid qus John A. Long and L. P. 
York, . Cooperative Test Service, New 


i 
IV. PROGNOSTIC TESTS IN GEOMETRY


Tet (oa Plane Geometry Aptitude 
ne high school. 1935. One form. Time: 
dup nee Correlation between apti- 
achie test and a 90-minute objective 
t еа test: .70. Correlation with 
emere ee of first-semester and second- 
Auth ster school marks combined: .59. 
hors: Harry A. Greene and Harold 
a ara Bureau of Educational Re- 
ou. and Service, State University of 
уй, Iowa City. 
2. Lee Test of Geometric Aptitude,
high school. 1931. One form. Time: 31
minutes. Median correlation between
this test of geometric aptitude and
achievement test score: .765. Correlation
between aptitude test and school
marks: .53. Reliability: .81 (N = 107).
Authors: Doris M. Lee and J. Murray
Lee. California Test Bureau, Los
Angeles, Calif.


QUESTIONS AND EXERCISES


1. a. Describe the objectives in teaching arithmetic.
b. How far do you think these objectives are measured by the
tests of fundamentals and problems contained in the
general test batteries?
2. Secure a copy of the California
Achievement Tests and study their provisions
for diagnosis. Do you think such
an instrument is adequate for the purpose
of diagnosis?
3. Describe in some detail the Compass
Diagnostic Tests in Arithmetic.
Make a detailed study of this instru-




3. Orleans Geometry Prognosis Test, 
high school. 1929. One form. Time: 70 
minutes. Correlation between the prog- 
nostic battery and an achievement test: 
.73 (probably would be raised to .80 
with the present much-lengthened test). 
No reliability reported. Authors: Joseph 
B. and Jacob S. Orleans. World Book 
Company, Yonkers, N.Y.


V. PROGNOSTIC TESTS OF ALGEBRA


1. Iowa Algebra Aptitude Test, grade 
9. 1931. One form. Correlation with 
single achievement test: .66 (N = 105). 
Probably more information needed 
about its construction and validation. 
Time: 35 minutes. Authors: Harry A. 
Greene and Alva H. Piper. Bureau of 
Educational Research and Service, State 
University of Iowa, Iowa City. 

2. Lee Test of Algebra Ability, grade 
9. 1930. One form. Time: 25 minutes. 
Correlation between this test and test of 
achievement: .71. Reliability: .93 (split- 
half method). Author: J. Murray Lee. 
Public School Publishing Company, 
Bloomington, Ill. 

3. Orleans Algebra Prognosis Test, 
grades 7-9. 1928-1932. One form. Time: 
81 minutes. Correlation with achieve- 
ment test at end of semester: .71 and .82 
(.80 estimated with present length). No 
reliability reported. Authors: Joseph B. 
and Jacob S. Orleans. World Book Com- 


pany, Bloomington, Ill. 




ment and conclude as to whether the
author of this test was justified in calling
it "the most efficient diagnostic test
constructed in any subject."

4. What are the leading characteristics
of the Buswell-John Diagnostic
Test for Fundamental Processes in
Arithmetic? The Cooperative Mathematics
Test for Grades 7, 8, and 9?


ing it 


take algebra or geometry? . 
6. Compare in some detail the two
points of view present in formulating
objectives in algebra and geometry.




7. Summarize the leading character- 
istics of the Cooperative Mathematics 
Test for Grades 7, 8, and 9.

8. What are the characteristics of an 
excellent diagnostic test in arithmetic? 
What are the limitations of diagnosis in 
a survey test?

9. Describe and illustrate the two
types of objectives in the teaching of
algebra.

10. What are the characteristics of
the Cooperative Algebra Test? Describe
the criticisms leveled at this test and
evaluate them.




Í 
es 0 
sor outcom 
11. What are the major 0 


? the 
the teaching of geomet icisms b 
12. Evaluate А in со jn 
multiple-choice epo “ute p 
ing geometry tests. Wha Joes this 
the teaching of geometry © К 
а à Е 
nique fail to cope af p the 
. Discuss the „Лоаре 4 
iab in e calm How valu 
tests? | 
"M. What are the aide i 
nostic tests of mathe 
one of them. 




CHAPTER 10 


Measurement of Science 


1 
Science and scientific thinking have come to form so integral a
part of daily life that their understanding becomes one of the major
objectives of education. Probably in no other areas of learning are
there more opportunities for application. The problems of living and
breathing, of health and recreation, of buying and selling, of
transportation and social interaction present problems so profuse
that great difficulty has been experienced in agreeing on a unified
course of study.
Just as in the social sciences so in the natural sciences there are the
problems of learning meaningful facts and their transformation into
behavior patterns. The application of the scientific method to the
solution of everyday problems thus becomes one of the most important
outcomes of the educational process.


AIMS AND OBJECTIVES OF SCIENCE TEACHING

The objectives of instruction in science are divided into five parts,
of which the first two are (1) the learning and understanding of
scientific facts and (2) the development of the scientific method.¹


1. Learning and understanding of facts and information acquired in
the various sciences


a. The discover 
b. Explaining 


d. Ability to read and understand scientific materials
e. Mastery of the terms and concepts peculiar to science
f. Skill in laboratory techniques
¹ These objectives parallel very closely but not exactly those set forth in
"The Measurement of Understanding in Science," Chap. VI, Forty-fifth Yearbook of the
National Society for the Study of Education, Part I, "The Measurement of
Understanding." Chicago: University of Chicago Press, 1946.


   g. Familiarity with well-authenticated sources of information
   h. Ability to name forms or structures and processes and to be
      acquainted with their functions.
2. Developing the scientific method
   a. Making the proper qualifications when interpreting data
      (1) Staying within the limits of the facts presented
      (2) Using caution and reservation in the inferences drawn
      (3) Avoiding the influence of irrelevant facts
   b. Ability to interpret data, i.e., to recognize trends in data by
      seeing common elements in diverse data
   c. Ability to identify valid cause-and-effect relationships
   d. Ability to draw correct conclusions from scientific data
   e. Giving correct reasons which adequately support conclusions
      (1) Knowing and selecting the principle that applies to the
          situation
      (2) Avoiding the influence of irrelevant factors
      (3) Citing reliable authorities
      (4) Avoiding both popular misconceptions and the assumption
          of conclusions
   f. Ability to formulate hypotheses and to plan experiments to
      test them
   g. Ability to identify the assumptions, whether stated or not,
      which are necessary to draw the conclusion.
3. To develop in children habits of healthful living, which include
   habits of performing useful tasks and of applying scientific
   principles in daily life
4. To develop in children interest in the scientific problems around
   them and in science itself
5. To develop in children some appreciation of the beauties of nature
   and of commonplace events which are so easily taken for granted.

As we examine the tests we shall raise the question as to what aspects
of these teaching aims and objectives are measured by the instrument in
question. We shall first examine tests suitable for the testing of the
objectives of science teaching in the elementary school and second, in
the high school.

TESTS OF SCIENCE IN THE ELEMENTARY SCHOOL

Tests of science appear in several of the achievement test batteries
suitable for testing the outcomes of instruction in the elementary school.
However, in such batteries as the California Achievement Test and the
Iowa Every-pupil Tests of Basic Skills, which concentrate on the testing
of basic skills, there are no tests of science achievement.



SCIENCE TESTS IN TEST BATTERIES
The science tests 


1 
Тав 8, Contents о> Science TESTS 


ae 
Average number of iten 


Jit 
trop?" і 
Coordinated Stanford M pieve еа 
Subject Scales of Achievement Test 
Attainment Test 1208 
ades 2) 
Grades | Grades Grades | Grades Ее P 
: 4- 
4-6 7-8 “4-6 7-8 BE a. 
s RR EI — | s8 2 
PM, аа е 13 5 12 11 9 
Health вав... x 8 5 , 1 | # 
ysics and astro," 5 15 13 
hemistry, Mi 4 16 6 10 4 
Geology апа Weather. = 2 11 4 9 1 1 
Physiology, X s 5 га 2 4 5 
Miscellaneous. NAE. | E z 5 4 6 1 52 
С ОНОЈ S 4 i 52 j 
S| 00 60 50 50 25° i? 
е re? gh 
Table 8 indicates that as we move from grades 4 to 6 to grades 7 and 8
there is a decrease in the number of items on animal life and health and
an increase in items on physics, astronomy, and chemistry. An illustration
from each of the main divisions of grade 5 of the Coordinated
Scales of Attainment is now presented.


now n 
" "Tu" : Min? 
Items by permission of Educational Test Bureau, Minneapolis, Minn.




18. A gnawing animal that lives in the water is the 1 mouse 2 shrew
3 muskrat 4 catfish.

45. Plants may grow tall and pale indoors because of lack of 1 water 2 air
3 light 4 soil.

32. There are few school hall accidents if pupils 1 walk fast 2 play tag
3 hurry to classes 4 walk quietly.

58. The amount of electricity a lamp uses is measured in 1 kilowatts
2 money 3 watts 4 volts.


The following illustrations are from grade 8:


38. The green material in plants helps them to 1 breathe 2 hold water
3 make food 4 produce seeds.

41. A body that has fallen from the sky to the earth is a 1 planetoid
2 meteor 3 meteorite 4 nebula.

53. A common chemical change is 1 rain falling 2 evaporation 3 air
circulating 4 burning.

The Stanford Achievement Test, as can be seen from Table 8,
covers about the same areas as do the Coordinated Scales of Attainment,
but less extensively. The former emphasizes health habits somewhat
more in grades 4 to 6 and physics and astronomy somewhat less
in grades 7 and 8. This Stanford Achievement Test uses only three
choices in its tests, which increases the chances of guessing. Illustrations
for grades 4 to 6 appear below:


19. The best cure for fatigue is—1 coffee 2 rest 3 tobacco

36. The buzz of a fly is made by its—7 feelers 8 wings 9 legs

30. Never use an electric appliance when—7 standing on a wet floor
8 camping 9 in bed


The following illustrations are for grades 7 and 8: 


4 the bear 5 the mink 45 6 


Which has the most valuable fur? es 
6 1 + о о 

2. ыга are on the centigrade thermometer is—70 8 100 | 8 ? 

39. Iron, lime, and phosphorus are examples of—7 minerals 8 proteins
9 enzymes

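The remark above that three-choice items increase the chances of guessing can be made concrete with a small sketch. It assumes blind guessing on every item and a hypothetical 50-item test; neither assumption comes from the Stanford manual, and the function name is illustrative only.

```python
from fractions import Fraction


def chance_score(n_items: int, n_choices: int) -> Fraction:
    """Expected number right when every item is answered by blind guessing."""
    return Fraction(n_items, n_choices)


# Hypothetical 50-item test (the item count is an assumption for illustration).
three_choice = chance_score(50, 3)
four_choice = chance_score(50, 4)

print(f"three choices: about {float(three_choice):.1f} items right by chance")
print(f"four choices:  about {float(four_choice):.1f} items right by chance")
```

Under these assumptions a pupil who knows nothing expects roughly 16.7 items right with three choices against 12.5 with four, so the same raw score carries less information on the three-choice form.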
Th : Achievement Tests! place considerable emptis 
on pls. Мепорошш = d their relations. Note the much es E. 
ics 4 e inter- 

ег of items dealing with astronom? Pos an 
*diate and advanced batterie? ERR Yr 
World Book Company, Yonkers, N.Y. 


1 
Ttems by permission of 




the usual number of items. Samples from this battery suitable for
grades 7 and 8 and lower 9 are:
„i have 
32. Telephone wires make a humming noise between poles os кан 
high pitch 2 ere stretched taut 3 carry electricity 2 estivates 
18. When a raccoon goes to sleep in the fall, it—1 propagates 2 estivates
3 hibernates 4 migrates.
41. To burn food the body needs—1 oxygen 2 air 3 carbon dioxide
4 hydrogen.
item 
: А . . es of 1 ^ 
It is quite evident from the discussion and from the samples. rm 


SPECIAL TESTS FOR SCIENCE
strations of entire tests devoted to the testing of


The Cooperative Science Test for Grades 7, 8, and 9 measures some
of the objectives set forth on pages 248 and 249. It is divided as shown
in the accompanying table.


Part | neer of minutes 
| item 
a) DÀ M 
1. Facts, Skills, and Application. .. | 75 E 
II. егтѕ and Concepts P nans ` 45 3 
III. Comprehension and Interpretation II" a | Z- 
ТОЮ, ee erm. P 80 
day 
губу. 
The items of Part 1 «ce in ev tti” 
e ате tak ich aris gel V. d 
living. The facts and skills еп from problems whic ful 5© yin 


А jn w 

, the superstition involve carried w 

‘S, what sleet is, how malaria g childr je 

c. ains: is the best food for growing sm 
ese į 


5 ics 
llustrate the rich variety of toP 


¹ Items used by permission of Educational Testing Service, Princeton, N.J.




12. Children of school age are vaccinated as a protection against 

12-1 malaria 

12-2 smallpox 

12-3 tuberculosis 

12-4 scarlet fever 

52. Of the following substances, which is the hardest:

52-1 iron

52-2 steel

52-3 cement

52-4 diamond 

52-5 granite 


Part II recognizes that in terms and concepts whole areas of meanings
and generalizations frequently are concentrated. They are of the first
importance also in reading scientific material. The following terms are
examples: "calorie," "constellation," "architect," "disinfectant,"
"mammal," "respiration," "experimentation," "microphone,"
"oxidation," "element," and "battery." Two items are:
ci the surroun 
ical world making up 
11, The imals, and physica! 
plants, animals, 
re called his 
11-1 adaptation 
11-2 heredity 
11-3 environment "T 
37. A device which has been used successfully for exploring the ocean at great
depths is the
37-1 bathysphere
37-2 hydrosphere
37-3 stratosphere
37-4 vivarium
37-5 depth bomb                                                        37( )


і i six 
nd Interpretation, 15 composed of 
jon а 


i itten and con- 

ther simply writ t 1 
i dere d on their appli- 

4 aBraphs on pa pes of the paragraph an 

11 questions both on 


i the best 
ix aphs include one on 
ys. and interpretation. hem js mma on the importance of the 
and interp 


dm the inter- 
n ant tulip i h, and one on t 
ways to preserve and а Pa., to Eirtibush ee po ee ia 
Ty turnpike from puns oxygen. One of be 9 ааа 
ction between plants aa what the aie am: 
аан 0 онази n teaching of reading 
15 now well recognize 


Part III, Comprehens 




grade and throughout high school. The teaching and understanding of
reading materials in science is of the first importance.
Another good test in the field of general science is the Ruch-Popenoe
General Science Test.¹ This test, published in 1923, is divided into two
parts: Part I, on terms and concepts, and Part II, consisting of drawings
with accompanying questions. Part I consists of 50 items. It uses the
multiple-choice form of testing. The 50 terms sample well the material
usually taught in a general science course. Samples of concepts are
"oxidation" and "ductility." Two illustrations are:


17. The act of transfer of pollen from anther to stigma is called
pollination   reproduction   fertilization   transpiration   mitosis
adaptation   filtration.

46. Glucose is found in large quantities in
eggs   grapes   olive oil   beefsteak   onions   rice   tapioca.

gesti 

0 drawings, with two to t pis 

ach drawing. The questions are either of ho tions NC 

: Drawings with their appropriate que йоу ing 

oblems as the names of the parts of а le 
principle of the lever 


i 
, the ш 
; the mechanical efficiency of pud of ? 

power of a pump, and the understanding of the general p 


‚йе. 
ficial freezing. One illustration will show the general techniq 


In this diagram of the digestive tract:

a  The small intestine is lettered . . . . .
b  The œsophagus is lettered . . . . .
c  The liver is lettered . . . . .
d  The stomach is lettered . . . . .
e  The pancreas is lettered . . . . .
This test has two forms, a reliability of .83, and consumes 40 minutes
in the taking. Probably its greatest weakness is its failure to include a
reading test of scientific material.
For the high school there is A Test of General Proficiency in the
Field of Natural Sciences by Paul J. Burke, one of the Cooperative
General Achievement Tests. It also is divided into two parts: Part I,
terms and concepts, and Part II, comprehension and interpretation.
The time consumed in the actua] i 


Part I asks questions about the meaning of "fossils," "calorie,"
"abrasive," "ion," "lymph," "wiggler," "the momentum of an object,"
and so forth. There are 50 items in all. One illustration is:¹


43. A substance which increases the number of hydrogen ions in a solution is known
as

43-1 a base 

43-2 a salt 

43-3 a buffer 

43-4 an acid 

43-5 an alkali 
Part II deals with the understanding of paragraphs. Many of the
questions ask that the subject apply the principle stated in the paragraph
to new illustrations. There are two reading selections and one
table dealing with the amount of theoretical horsepower required to
raise water to different heights. From this table several problems in
physics are constructed. This test covers two areas of general science,
biology and physics. Percentile norms are available.
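The horsepower table mentioned above is not reproduced in the text, but the kind of physics problem it supports can be sketched. The example below is an illustration only: the flow rate, the heights, and the figure of about 8.34 lb per U.S. gallon of water are assumptions supplied here, not values from the test. It uses the definition of one horsepower as 33,000 foot-pounds of work per minute.

```python
FT_LB_PER_MIN_PER_HP = 33_000  # definition of one horsepower
LB_PER_GALLON = 8.34           # approximate weight of a U.S. gallon of water (assumed)


def theoretical_hp(gallons_per_minute: float, height_ft: float) -> float:
    """Horsepower needed to lift water, ignoring friction and pump losses."""
    pounds_per_minute = gallons_per_minute * LB_PER_GALLON
    return pounds_per_minute * height_ft / FT_LB_PER_MIN_PER_HP


# Hypothetical table: 100 gallons per minute raised to several heights.
for height in (10, 33, 66, 100):
    print(f"{height:>4} ft: {theoretical_hp(100, height):.3f} hp")
```

Because the relation is linear in height, doubling the lift doubles the theoretical horsepower, which is the sort of inference a pupil could be asked to draw from such a table.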
TESTS OF SCIENCES IN HIGH SCHOOL 


TESTS OF BIOLOGY


Our best standard tests in biology sample well the information which
the student has acquired. Frequently items are so arranged that some
reasoning and thinking are required to arrive at the correct answer. In
one or two tests, subjects are asked to predict the outcome under the
conditions named. In no cases are the reasons required for the conclusion
drawn, nor are hypotheses to be formulated for the explanation of facts.
Moreover, both the planning of experiments to solve pressing problems
and the understanding of the nature of proof are neglected. It is,
therefore, well to remember that none of the tests described measure all
the outcomes of the teaching of biology. It is always important to ask
about any test, "What aspects of biological instruction does this
instrument test well?"
The Cooperative Biology Test is one of the better types of standardized
tests. Its items, based on information usually taught in biology,
require thought to answer them correctly. The norms of this test are
well established. The reliability as computed by the odd-even technique
is .94 and thus is satisfactory.
This test is divided into two parts: Part I, which requires 25 minutes
of testing time, is composed of 15 items. Many of the items are taken
from problems met in daily life. How to get rid of cockroaches, what to
worry about in case of termites in the neighborhood, types of insects

¹ Item by permission of Educational Testing Service, Princeton, N.J.


which reduce yield from a hay field, what is the best thing to do about
influenza, and what a morning sore throat implies are illustrations.
Physiological material is emphasized more than morphological.
Two samples from Form Q will indicate the manner in which functional
information is tested:


28. Which of the following factors is part of the normal environment of deep sea
organisms and not of land organisms?
28-1 The presence of oxygen
28-2 The presence of mineral salts
28-3 Great pressure
28-4 The presence of natural enemies
28-5 Freezing temperatures                                             28( )

49. A certain species of land plant develops broad leaves which contain chlorophyll.
This indicates that this plant
49-1 grows best in dry regions
49-2 will grow only on acid soils
49-3 is able to make food from carbon dioxide and water
49-4 is able to survive extreme variations in temperature
49-5 is probably a type of fungus                                      49( )


Part II, which re 
matching problems 
drawings, and also 
of a tooth, for exam: 
a tooth; from dra 


oth 


"7 nd " g 
NES 5 erst jn 
; Which in most cases involve the und aw 


Wings, he must recognize certain ee item Tt 
On the other hand there are no drawings in the last Je th n^ 
questions are asked in Such a way as to require considera ü a е 
to answer them correctly. One must recognize a disease via iP ist 
made less common by a better diet, and f rom the knowledge Р 


е £* 
i ts 9 ar 
enerate lost or injured p Po m Q 
rfish. Two illustrations fron 


36. A single-celled organism is found to have cytoplasm, a nucleus, chloroplasts,
vacuoles, and a cell membrane. Which of these indicates that the organism is a
plant rather than an animal?
36-1 The cytoplasm
36-2 The chloroplasts
36-3 The nucleus
36-4 The cell membrane
36-5 The vacuoles                                                      36( )

42. L represents long-haired, which is dominant; s represents short-haired, which is
recessive; LL is crossed with ss. The offspring in the first generation will be in
the ratio of
42-1 2LL + 2ss
42-2 4LL
42-3 4ss
42-4 LL + 2Ls + ss
42-5 4Ls                                                               42( )



Percentile norms are available for this test. 
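Item 42 above turns on a one-gene cross, and the reasoning it demands can be sketched as a small Punnett-square tally. This is an illustration added here, not part of the test; the function name is hypothetical, and the code assumes that each parent contributes one of its two alleles with equal chance.

```python
from collections import Counter
from itertools import product


def cross(parent1: str, parent2: str) -> Counter:
    """Tally offspring genotypes for a one-gene cross (a Punnett square)."""
    offspring = Counter()
    for allele1, allele2 in product(parent1, parent2):
        # Write the dominant (uppercase) allele first so "sL" and "Ls" agree.
        genotype = "".join(sorted(allele1 + allele2, key=lambda c: (c.islower(), c)))
        offspring[genotype] += 1
    return offspring


first_generation = cross("LL", "ss")   # the cross described in item 42
second_generation = cross("Ls", "Ls")  # crossing two first-generation hybrids

print(dict(first_generation))
print(dict(second_generation))
```

The first tally shows every one of the four squares holding the hybrid Ls; the second gives the familiar LL + 2Ls + ss, which is why that ratio appears among the distractors.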
A second test, the Ruch-Cossman Biology Test, despite its age (1924),
is worthy of consideration. The items for this test were selected from
examination questions supplied by 126 teachers, who were asked to
send in to the investigator copies of the examination questions used
during that year. From the 2,000 questions received, the 300 occurring
most frequently were selected. These questions were then rated by
68 teachers and 9 "authorities." Each question was rated "1" if entirely
satisfactory, "2" if partially satisfactory, and "3" if entirely
unsatisfactory. Most of the items selected for the test came from those
rated "1." The test's reliability ranges in coefficients from .76 to .87
as computed from populations ranging in age from 12 to 28. By combining
Form A and Form B into one test a satisfactory reliability of .90 or
above was obtained. The probable error of measurement is three points.
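The gain in reliability from combining two forms can be illustrated with the Spearman-Brown formula, the standard projection for a lengthened test. The text does not say that the figure of .90 was obtained this way, so the sketch below is an assumption offered for illustration; it treats the two forms as parallel and the combined test as twice the length of one form.

```python
def spearman_brown(r: float, k: float = 2.0) -> float:
    """Projected reliability when a test of reliability r is lengthened k times."""
    return k * r / (1 + (k - 1) * r)


# The single-form coefficients reported above, projected to a doubled test.
for r in (0.76, 0.87):
    print(f"single form {r:.2f} -> both forms combined {spearman_brown(r):.3f}")
```

On these assumptions the reported range of .76 to .87 projects to roughly .86 to .93, which shows how a doubled test can approach or pass .90 although neither form does so alone.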
The Ruch-Cossman Biology Test is composed of five tests or parts.
Test 1 is composed of 40 terms whose correct definitions or meanings
appear among seven possible answers. The student is asked about the
action of gravitation on roots, about chlorophyll, what mandibles,
enzymes, and collar cells are. Two illustrations are:¹


38. The stage of an embryo which most closely resembles a hollow ball is the
ovum   blastula   pupa   gamete   gastrula   chrysalis   zoöspore.
37. Fehling's solution is a test for
fats   cellulose   glucose   albumin   starch   proteins   minerals.

¹ Items by permission of World Book Company, Yonkers, N.Y.

Test 2 is composed of 18 incompleted statements which are completed
by checking one of three statements:

13. The arthropods always possess
____ Three distinct body regions
____ Two pairs of wings
____ Jointed appendages

Test 3 matches 18 names of structures with their positions in four
drawings.
Test 4 has two illustrations of the working of Mendelian inheritance.
Test 5 is made up of five paragraphs, each of which has certain key
words omitted. The usual difficulty found in marking completion tests
is present in this test.

This test samples well the worth-while facts learned in a biology


256 PROBLEMS OF MEASUREMENT 


: bout 
which reduce yield from a hay field, what is the best thing to UE 
influenza, and what a morning sore throat implies—these ne went 
tions. Physiological material is emphasized more than morp 


AS. : : actiona | 
Two samples from Form Q will indicate the manner in which fur 
information is tested: 


28. Which of the following factors is 
organisms and not of land organi: 
28-1 The presence of oxygen 
28-2 The presence of mineral salts 
28-3 Great pressure 
28-4 The presence of natural enemies 


28( ) 
28-5 Freezing temperatures 
49. A certain species of lan 


This indicates that this plant 
49-1 grows best in dry regions 
49-2 will grow only on acid soils 
49-3 is able to make 
49-4 is able to surviv. 
49-5 is Probably a {у 


" еер 52 
Part of the normal environment of deep | 
sms? 


Jl. 
: ; ophY 
d plant develops broad leaves which contain chlor 1 


pe of fungus 


| last? 
msm is found to have—cytoplasm, a nucleus, E je 
> ante ae Which of these айсар thet the OT 
plant rather than an animal? 

36-1 The cytoplasm 

36-2 The chloroplasts 

36-3 The nucleus 

36-4 The cell membrane e 

36-5 The vacuoles 36 js 
42. L represents long-haired, whi 

recessive; LL is crossed with ss, The offs 

the ratio of 


42-1 2LL + 2ss 


36. A single-celled Orga; 
vacuoles, and а cell 


MEASUREMENT OF SCIENCE 257 


42-2 йл, 
42-3 455 


42-4 LL + 2Ls + ss 
42-5 4Ls 
42 ) 


Norms are available for this test.
The Ruch-Cossman Biology Test, despite its age (1924), is worthy of
consideration. The items for this test were selected from examination
questions supplied by 126 teachers, who were asked to send in to the
investigator copies of the examination questions used that year. From the
2,000 questions received, the 300 occurring most frequently were selected.
These questions were then rated by [...] teachers and 9 authorities. Each
question was rated "1" if entirely satisfactory, "2" if partially
satisfactory, and "3" if entirely unsatisfactory. Most of the items selected
for the test came from those rated [...].
The test's reliability ranges from .76 to .87 as computed [...] ranging in
age from 12 to 28. By combining [...] is obtained. The probable error of
[...].
The Ruch-Cossman Biology Test is composed of [...] tests. Test 1 is composed
of 40 terms, the correct [...] of which appears among seven possible answers.
The student [...] the action of gravitation on roots, about chlorophyll, what
enzymes and collar cells are. Two illustrations are:¹

28. The stage of an embryo which most closely resembles a hollow ball is the
gastrula blastula pupa gamete chrysalis zoöspore [...].
37. Fehling's solution is a test for
cellulose glucose albumin starch proteins minerals [...].

Test 2 is composed of 18 incompleted statements which are completed by
checking one of three statements:

8. The arthropods always possess
____ Three distinct body regions
____ Two pairs of wings
____ Jointed appendages

Test 3 matches 18 names of structures with their positions in four drawings.
Test 4 has two illustrations of the working of [...]. Test 5 is made up of
paragraphs, each of which has certain key words omitted. The usual difficulty
found in marking completion tests is present in this test.
This test samples well the worth-while facts learned in a biology course in
high school. It does not attempt to test a student's capacity to formulate
hypotheses, to set up experiments, to test hypotheses, or to reason
logically.

¹ Items by permission of World Book Company, Yonkers, N.Y.


TESTS OF CHEMISTRY

The Cooperative Chemistry Test is divided into two parts. Part I, which
requires 25 minutes of testing time, contains 56 items arranged in the
best-answer manner. With few exceptions the questions require a functional
understanding of the chemical terms and processes. The second part contains
39 questions. These two illustrations are from Part I:


11. Some paints darken on standing. This is caused by the formation of
11-1 ZnS
11-2 ZnSO4
11-3 PbS
11-4 PbSO4
11-5 TiO2

31. The catalyst in the contact process affects which of the following
changes?
31-1 S + O2 → SO2
31-2 2SO2 + O2 → 2SO3
31-3 H2SO4 + SO3 → H2S2O7
31-4 SO3 + H2O → H2SO4
31-5 [...]

The next two illustrations are from Part II:

10. When CO2 is bubbled into limewater, a white precipitate forms which
dissolves upon the further addition of CO2. The substance finally remaining
in solution is
10-1 CaO
10-2 Ca(OH)2
10-3 Ca(HCO3)2
10-4 CaCO3
10-5 Ca2(OH)2(CO3)

25. One of the products in the completely balanced reaction between ZnCl2
and AgNO3 is
25-1 ZnNO3
25-2 AgCl2
25-3 2ZnNO3
25-4 2AgCl
25-5 ZnAg(NO3)2


Some of the questions in this [...]. The subject is asked about [...] an
instrument [...] to test a storage battery, or to [...] why most gold in use
[...]. [...] the items would fall under the head "knowledge of, and ability
to use, fundamental tools of chemistry." The authors expect to measure five
principal types of objectives:

(a) Knowledge and understanding of chemical laws, principles [...].
(b) Knowledge of, and ability to use, fundamental tools of chemistry.
(c) Knowledge and appreciation of applications of chemistry in [...] and in
daily life.
(d) Ability to perform correctly simple basic chemical calculations
involving the application of chemical [...].
(e) Knowledge and appreciation of great chemists and [...] contributions.

As a whole the test achieves well what it sets out to do, but probably tests
"knowledge of, and ability to use, fundamental tools of chemistry" best of
all. Since the items were based on the content of four widely used textbooks
in chemistry, they cover well the field of traditional chemistry but not so
well the field of modern chemistry.
The test has forms N, O, and S and has satisfactory norms. Norms in
percentile points are furnished for both public and independent secondary
schools. Its reliability is well above .90 when computed from the [...]
secured from the members of one class. Its correlations with school marks
run from .63 to .78. Traxler showed¹ that with intelligence constant the
correlation of this test and school marks in independent secondary schools
is .64.
Another test is the Columbia Research Bureau Chemistry Test.² It is divided
into three parts. Part I consists of 150 items constructed according to
true-false principles. Two illustrations are:

[...] Carbon monoxide is important in metallurgical industries because it is
a reducing agent. ( )
[...] nitrate is the only commercially important mineral source of fixed
nitrogen. ( )

Part II is composed of 22 exercises which deal with the completion and
balancing of equations. Examples are:

[...] copper placed in an aqueous solution of silver nitrate
14. The action of water on phosphorous tribromide
[ ] + [ ] → [ ] + [ ]

¹ Traxler, Arthur E., "Correlation of Achievement Scores and School Marks,"
School Review (1937) 45:198-201.
² Items by permission of World Book Company, Yonkers, N.Y.
Part III contains 10 problems to be solved. Two items are:

4. Calcium carbonate is acted upon by HCl according to the equation:
CaCO3 + 2HCl → CaCl2 + CO2 + H2O. Suppose you have 50.0 g of CaCO3 to
convert into CaCl2, what is the minimum amount of HCl you will have to
furnish (in grams)? ( )
(Atomic weights: Ca = 40, C = 12, O = 16, Cl = 35.5, H = 1)

[...]. A gas under a pressure of 800 millimeters of mercury and at a
temperature of 27°C. occupies 100 liters. How many liters will the same
weight of gas occupy, if the pressure is decreased to 400 millimeters and
the temperature raised to 177°C.?
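The two Part III problems can be worked through as a short calculation. This is a sketch rather than the test's own scoring key. It assumes the quantity in problem 4 reads 50.0 g, uses the rounded atomic weights the item supplies, and converts Celsius to absolute temperature by adding 273. The function names are hypothetical.

```python
# Rounded atomic weights as given in the test item.
ATOMIC_WEIGHTS = {"Ca": 40.0, "C": 12.0, "O": 16.0, "Cl": 35.5, "H": 1.0}


def formula_weight(composition: dict) -> float:
    """Sum atomic weights for a formula given as {element: atom count}."""
    return sum(ATOMIC_WEIGHTS[element] * count for element, count in composition.items())


def grams_hcl_needed(grams_caco3: float) -> float:
    """CaCO3 + 2HCl -> CaCl2 + CO2 + H2O: two moles of HCl per mole of CaCO3."""
    moles_caco3 = grams_caco3 / formula_weight({"Ca": 1, "C": 1, "O": 3})
    return 2 * moles_caco3 * formula_weight({"H": 1, "Cl": 1})


def new_gas_volume(v1: float, p1: float, t1_c: float, p2: float, t2_c: float) -> float:
    """Combined gas law, V2 = V1 * (P1 / P2) * (T2 / T1), with T in absolute degrees."""
    t1_abs = t1_c + 273.0  # assumed rounding of the Celsius-to-absolute offset
    t2_abs = t2_c + 273.0
    return v1 * (p1 / p2) * (t2_abs / t1_abs)


print(grams_hcl_needed(50.0))                        # grams of HCl for 50.0 g CaCO3
print(new_gas_volume(100.0, 800.0, 27.0, 400.0, 177.0))  # liters at the new conditions
```

Halving the pressure doubles the volume, and raising the absolute temperature from 300 to 450 multiplies it by a further 1.5.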


TESTS OF PHYSICS

Only one test of physics will be described, but others are listed at the end
of the chapter.
The Cooperative [Physics Test ...] such a manner [...] introduced with an
incomplete [...] followed by five choices [...]. [...] of the test contain
nearly the same number of [...] and require 20 minutes of working time each.
Part I has an irregular arrangement of items dealing with [...]; the subject
must jump from [...] two forces. The second part [...] unified, having 22
items on electricity, 14 on light, and 6 on sound. The following examples
are from Part I:¹

¹ [...] Testing Service, Princeton, N.J.


18. A stone falls from rest. At the end of 14 second its speed is
approximately
18-1 8 ft. per sec
18-2 2 ft. per sec
18-3 16 ft. per sec
18-4 4 ft. per sec
18-5 32 ft. per sec

29. A calorie is a unit of
29-1 weight
29-2 temperature
29-3 force
29-4 power
29-5 energy

These examples are from Part II:

9. The electric current in a horizontal wire is from south to north. If a
compass needle is placed beneath the wire, its N-pole will be
9-1 [...]
9-2 deflected downward
9-3 deflected upward
9-4 deflected toward the east
9-5 deflected toward the west.

26. The fact that a candle flame gives a continuous spectrum is evidence
that it contains
26-1 luminous gases
26-2 unburned gases
26-3 [...] of different temperatures
26-4 droplets of warm liquid
26-5 particles of an incandescent solid.
This test is well standardized. It has forms N, O, P, Q, and S available as
well as percentile norms both for preparatory schools and public schools.
Scaled scores and standard errors of measurement are also provided. The
reliability for the 40-minute test is in the neighborhood of .92.
There are three or four minor criticisms of this test. In the first place,
[...] to be done in 40 minutes leaves little time for contemplation. [...]
by arranging the items irregularly in Part I of the test, the subject is
compelled to shift quickly from one item to another. In the third place,
there are some items which merely require identification, a one-step mental
process. It is possible that a few more problems which demand reasoned
understanding [...] improve the test.
[...] another physics test [...] because it [...] this test satisfies [...]
of a good test, [...] understanding of significant facts [...] principles of
physics. This test has a reliability of .97 and a correlation of .73 with
school marks.


INSTRUCTIONAL TESTS IN SCIENCE

Four instructional tests are here described. These are (1) Blaisdell
Instructional Tests in Biology, (2) Glenn-Welton Instructional Tests in
Chemistry, (3) Glenn-Obourn Instructional Tests in Physics, and
(4) Glenn-Greenberg Instructional Tests in General Science. [...] 25 units
of work in biology and physics and 36 units in chemistry. For each unit of
work presumably taught from any standard textbook there is a complete,
standardized test composed usually of 25 to 50 items and in some cases a
longer over-all test by way of review at the end of a division.

The authors claim, and for the most part justly, that these tests are useful
in the following ways:¹

1. to provide information about student achievement on which to base
instructional practices.
2. to diagnose learning difficulties of students and study the nature of
their errors separately for each unit of beginning chemistry.
3. to make frequent inventories of a student's success with a [...]
expenditure of time.

¹ Manual for Glenn-Welton Instructional Tests in Chemistry, p. iii. World
Book Company, Yonkers, N.Y. By permission.


SCIENTIFIC THINKING

Scientific thinking is an outcome of every scientific relationship that is
perceived, every problem that is exactly solved. It is developed when
students are taught to delay their inferences until all the data are in. It
is encouraged when a student makes no statement in geometry unless the
grounds for his proof are also presented. Wherever critical analyses are
made of the facts presented, there scientific method is beginning. Finally,
when an individual acquires a mind which is willing to accept the facts and
draw his conclusions from them, he has made progress toward scientific
thinking.
All these characteristics (perceiving relations in scientific data,
demanding accuracy of result, withholding inferences until all the data are
in, asking for more data, demanding grounds for statements, honestly
analyzing data present, and being willing to accept facts and to draw
conclusions from them) are well known to all teachers of science. Many of
them, however, are too much carried away by the load of detail which their
students must master to take the trouble to instruct students in the
scientific method. Scientific method has a [...] transfer value. Properly
developed with one sort of data, [...] illustrated, and contrasted with the
method of superstition or common sense, it extends far beyond the biology,
physics, or chemistry where it is learned to much broader areas of the
social sciences and thinking in general.
The tests suggested here are pretty largely checks to see if the students
are able to apply their scientific method to new situations. The test
introduced below tests to see if a pupil can draw the right conclusion from
rather simple data and then can check the correct reasons.¹
Form 1.3
APPLICATION OF PRINCIPLES
Directions: In each of the following exercises a problem is given. Below
each problem are two lists of statements. The first list contains statements
which can be used to answer the problem. Place a check mark (√) in the
parentheses after the statement or statements which answer the problem. The
second list contains statements which can be used to explain the right
answers. Place a check mark (√) in the parentheses after the statement or
statements which give the reasons for the right answers. Some of the other
statements are true but do not explain the right answers; do not check
these. In doing these exercises then, you are to place a check mark (√) in
the parentheses after the statements which answer the problem and which give
the reasons for the right answers.

¹ Smith, Eugene R., Ralph W. Tyler, et al., Appraising and Recording Student
Progress, pp. [...]. New York: Harper & Brothers, 1942. By permission.


In warm weather people who do not have refrigerators sometimes wrap a bottle
of milk in a wet towel and place it where there is a good circulation of
air. Would a bottle of milk so treated stay sweet as long as a similar
bottle of milk without a wet towel?

A bottle wrapped with the wet towel would stay sweet
a. longer than without the wet towel ( ) a.
b. not as long as without the wet towel ( ) b.
c. the same length of time, the wet towel would make no difference ( ) c.

[The labels in the left-hand] column are used in scoring. They do not appear
on the test.

Superstition          d. Thunderstorms hasten the souring of milk ( ) d.
Right Principle       e. The souring of milk is the result of the growth and
                         life processes of bacteria ( ) e.
Wrong                 f. Wrapping the bottle prevents bacteria from getting
                         into the milk ( ) f.
Wrong                 g. A wet towel could not interfere with the growth of
                         bacteria in the milk ( ) g.
Wrong                 h. Wrapping keeps out the air and hinders [...] ( ) h.
Right Principle       i. Evaporation is accompanied by an absorption of
                         heat ( ) i.
Authority             j. Milkmen often advise housewives to wrap bottles in
                         wet [...] ( ) j.
Unacceptable Analogy  k. Just as many foods are wrapped in cellophane to
                         keep [...] sweet by wrapping a wet towel around the
                         bottle to keep the moisture in ( ) k.
Right Principle       l. Bacteria do not grow so rapidly when temperatures
                         are kept low ( ) l.
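The scoring labels attached to each reason suggest how a paper on this exercise might be tallied. The following is a minimal sketch, not the published scoring procedure: the dictionary `SCORING_KEY` and the function `tally_reasons` are hypothetical names, and the letters f, g, and h are assumed readings of characters that are illegible in the printed key.

```python
from collections import Counter

# Category assigned to each reason in the milk-bottle exercise.
# Letters f, g, and h are assumed; the source print is unclear there.
SCORING_KEY = {
    "d": "Superstition",
    "e": "Right Principle",
    "f": "Wrong",
    "g": "Wrong",
    "h": "Wrong",
    "i": "Right Principle",
    "j": "Authority",
    "k": "Unacceptable Analogy",
    "l": "Right Principle",
}


def tally_reasons(checked_letters):
    """Count how many checked reasons fall under each scoring category."""
    return Counter(SCORING_KEY[letter] for letter in checked_letters)


# A pupil who checks reasons e, i, and j.
pupil_tally = tally_reasons(["e", "i", "j"])
print(dict(pupil_tally))
```

A tally of this kind separates a pupil who reasons from principles from one who leans on authority or analogy, even when both choose the right conclusion.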
A second illustration involving pretty largely the facts learned in science
is now given. In this case facts and assumptions [...] distinguished.

Exercise 2¹

[...] white ashes. Bill [...] heated, gave off [...] the bottle [...] gas.
The magnesium ignited, burned with a bright light, and left white ashes.
Bill told his friends that his results conclusively proved that the colored
gas was chlorine.

¹ [...] "[...] Understanding," Forty-fifth Yearbook of the National Society
for the Study of Education, Part I, pp. 132-134. Chicago: University of
Chicago Press, 1946. By permission.

Part 1. Directions: Read each statement. Is the statement a FACT, or is it
an ASSUMPTION? Place a check mark (√) in the appropriate column before the
statement.

Part 2. Directions: Read over again only those statements which you have
marked as assumptions. Place a check mark (√) after those TWO ASSUMPTIONS
which are absolutely necessary in proving that the gas was chlorine. Do not
mark more than two.

Fact  Assumption  Statements
____  ____  [ ]. The material the chemistry teacher gave him was magnesium.
____  ____  [ ]. Chlorine gas is the only gas in which magnesium will
                 ignite.
____  ____  [ ]. Chlorine gas is the only gas in which magnesium will
                 ignite, burn with a bright light, leaving white ashes.
____  ____  [ ]. Bill mixed and heated some chemicals which gave off a
                 colored gas.
____  ____  [ ]. A small piece of magnesium will ignite and burn with a
                 bright light in an atmosphere of chlorine gas, leaving
                 white ashes.
____  ____  [ ]. Chlorine gas is the only gas in which magnesium will burn
                 with a bright light.
____  ____  [ ]. Bill collected some of the colored gas in a bottle.
____  ____  i. The properties of the colored gas in the bottle were the only
               cause of the magnesium igniting, burning with a bright light,
               and leaving white ashes.
____  ____  j. Bill put a small piece of magnesium in the bottle.
____  ____  k. The properties of the colored gas in the bottle were not the
               cause of the magnesium igniting, burning with a bright light,
               and leaving white ashes.
____  ____  [ ]. The magnesium ignited, burned with bright light, and left
                 white ashes.
____  ____  [ ]. Chlorine is not the only gas in which magnesium will burn
                 with a bright light and leave white ashes.


Part 3. Are you learning how to develop [...]? [...] arguments for or
against some [...] are presented in newspapers, speeches, or textbooks, we
often [...] discussion could have been more logical. [...] sometimes put in
statements that are really unnecessary to prove their point; at other times
they leave out important arguments; on still other occasions they arrange
their statements in such poor order that the conclusion does not seem to be
based on or to grow out of the arguments.
Directions: Suppose you were describing this experiment in order to prove
that chlorine gas was collected. What are all of the absolutely necessary
statements in the complete development of the proof? Use as many of the
statements as are necessary and place the letters of these statements in
their proper order on the line below. Do not use any unnecessary statements.

Check the following statement which best represents your own personal
opinion as to the nature of the gas.

____ a. I believe that it has been adequately proved that the colored gas
must be chlorine.
____ b. I do not believe that the colored gas [...].

Write out the reasons you have to support your opinion.


Evidence concerning the student's understanding of good [...] analogy,
avoiding a repetition of a conclusion and certain [...] responses to test
items like the one described under Application of Principles, page 263.
[...] developing procedures to inculcate [...] thinking scientifically.
Illustrations of informal tests, which the teacher can utilize or imitate to
measure the progress of students in using the scientific method, are there
presented.

ATTITUDES AND INTERESTS IN SCIENCE

Attitude, as is pointed out in Chap. 17, consists of a learned tendency,
set, or disposition to act favorably or unfavorably toward an object,
process, situation, or person. It is not the habit of accuracy but the set
or disposition to be accurate.¹


[...] discover the presence [...] of the students [...] questionnaire or
self-rating scale. [...] anecdotal records of those [...] the lack of it. Up
to the present time, the second procedure has been most fruitful. In the
class [...] which reflect [...] reporting, and the inclination to [...]
activities opportunity is offered [...] observations. Suppose we add to
these opportunities [...] home. Accurate reports of electric stoves mended,
of pigs raised, of farm machinery put in service, or of animals bred and
raised in a scientific manner furnish further indicators of attitude and
interests. Books and magazines read are a third source of valuable
information. Science and Invention, Popular Mechanics, and such periodicals
contain much [...] about science, and if a boy reads regularly such a [...]
reflecting definitely his interest. If all these activities indicate the
presence of interest and a scientific attitude, then there is little reason
to doubt that it is present.
The best procedure to quantify such attitudes [...] to formulate a check
list of activities which reflect [...] attitudes. Each activity should then
be given a weight [...] teacher's best judgment as to its value in [...]
attitude. The list and the weights would be modified [...] a fairly stable
form for that community could [...]. The student should not be aware of the
check list, for [...] and "teacher pleasers" would be [...] not possessing
the attitude. Such a careful [...] list would indicate to the teacher
whether the students [...] one of the most important outcomes of science
teaching, [...] scientific attitude.

¹ Jordan, A. M., Educational Psychology, p. [...]. New York: Henry Holt and
Company, Inc., 1942. By permission.
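The weighted check list described above amounts to a simple sum of teacher-assigned weights over the activities observed for a student. The sketch below is illustrative only: the activity names echo examples in the text, but the weights and the function name `attitude_score` are invented for the example and carry no authority.

```python
# Hypothetical check list: activity -> weight set by the teacher's judgment.
CHECK_LIST_WEIGHTS = {
    "mended an electric stove": 3,
    "raised pigs by a scientific method": 3,
    "put farm machinery in service": 2,
    "reads Popular Mechanics regularly": 2,
    "reports observations accurately in class": 1,
}


def attitude_score(observed_activities, weights=CHECK_LIST_WEIGHTS):
    """Sum the weights of the distinct listed activities observed for one student."""
    # A set is used so that an activity recorded twice is counted once.
    return sum(weights[activity] for activity in set(observed_activities) if activity in weights)


student_record = [
    "mended an electric stove",
    "reads Popular Mechanics regularly",
    "collects stamps",  # not on the check list, so it adds nothing
]
print(attitude_score(student_record))
```

Because the weights are judgments, they would be revised as the teacher gains experience with the list, as the text recommends.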




SUMMARY

Five types of objectives for the teaching of science have been described:
(1) the learning and understanding of scientific facts, (2) the [...] of the
scientific method, (3) the development [...] of habits of healthful living,
(4) the development of [...] problems, and (5) the development of the
appreciation of [...] of nature. The greatest success in measurement has
been in [...] and understanding of scientific facts. This fact is true of
tests suitable both for the elementary and the high school.
In the elementary school, standardized tests of science information [...].
Tests of the presence and application of scientific thinking were included
to indicate the direction [...] be [...].


LISTS OF SCIENCE TESTS

I. GENERAL SCIENCE

1. Analytical Scales of Attainment in Elementary Science, grades 5-6, 7-8,
9. 1933. Three levels. One form. Time: 45 minutes. Authors: M. J. Van
Wagenen and August Dvorak. Educational Test Bureau, Minneapolis, Minn.
2. Applications of Principles in Science, grades 9-12. 1949. One form. Time:
60 minutes. Authors: Committee of Progressive Education Association,
Evaluation in the Eight Year Study, Chicago.
3. Cooperative General Science Test, high school. 1939-1947. [...] and X.
Time: 40 minutes. Authors: [...] Underhill. Cooperative Test Service, New
York.
4. General Science Test, National Achievement Tests, grades 4 [...]. [...]
1939. Two forms. Nontimed ([...] minutes). Authors: [...], Robert K. Speer,
Lester [...], and Samuel Smith. Acorn Publishing Co., Rockville Center, N.Y.
5. Science Information Test, grades [...]. Two forms. Two levels. Nontimed
(about 60 minutes). Elementary, grades 4-6; intermediate, grades 7-9.
Author: Everett T. Calvert. California Test Bureau, Los Angeles, Calif.
6. Test of General Proficiency in the Field of Natural Sciences
(Coop[...]), high school and college. 1947 ([...] series). Several forms.
Time: 40 minutes. Authors: Paul L. Burke et al. Cooperative Test Service,
New York.
7. [...] Science Test, grades [...]. 1941-1947. New forms each year. Time:
80 minutes. Authors: John G. Zimmerman, Richard E. Watson, et al.
Cooperative Test Service, New York.
8. Ruch-Popenoe General Science Test, high school. 1923. Forms A and B.
Time: 40 minutes. Authors: Giles M. Ruch and Herbert E. Popenoe. World Book
Company, Yonkers, N.Y.
9. [...] Test of the Natural Sciences [...]. High school and college
placement. 1939. Several forms. Time: 4[...] minutes. Author: Carl P.
Swinnerton. Cooperative Test Service, New York.
10. Examination in General Science, high school level. 1945. Form B. Time:
150 [...] minutes. Authors: Examination Staff of the U.S. Armed Forces
Institute. American Council on Education, Cooperative Test Service, New
York.
11. [...]-McDougal General Science Test, high school. 1941. Forms A and B.
Time: 40 minutes. Authors: H. E. Schrammel and Clyde R. McDougal. Bureau of
Educational Measurements, Kansas State Teachers College, Emporia, Kans.

II. BIOLOGY

1. Cooperative Biology Test, high school. 1939-1947. Forms P, Q, S, and
[...]. Time: 40 minutes. Authors: [...] Fitzpatrick, S. R. Powers, et al.
Cooperative Test Service, New York.
2. Ruch-Cossman Biology Test, [...]. 1924. Two forms. Time: 38 minutes.
Authors: Giles M. Ruch and Lee H. Cossman. World Book Company, Yonkers, N.Y.
3. Williams Biology Test, high school. 1934. Two forms. Time: 40 minutes.
Authors: John R. Williams and H. E. Schrammel. Bureau of Educational
Measurements, Kansas State Teachers College, Emporia, Kans.
4. Application of Principles in Biological Science, grades 10-12. 1940. One
form. Time: 60 minutes. Authors: Evaluation Staff. Evaluation in the Eight
Year Study, Progressive Educational Association, Chicago.
5. Blaisdell Instructional Tests in Biology, high school. 1929. One form. 25
tests in animal, human, and plant biology. One reliable test for each of 25
units of work. Author: J. G. Blaisdell. World Book Company, Yonkers, N.Y.
6. Biology: Every Pupil Test, high school. 1946-1947. New form each year.
Author: David B. Davis. Ohio State Department of Education, Columbus, Ohio.
7. Examination in Biology, high school level, grades 10-11. 1945. Form B.
Authors: Examination Staff of the U.S. Armed Forces Institute. American
Council on Education, Cooperative Test Service, New York.


III. CHEMISTRY

1. Chemistry: Every Pupil Test, high school. [...]. New form each year.
[...] minutes. Ohio Scholarship [...] Department of Education, [...] Ohio.
2. Columbia Research Bureau Chemistry Test, grades 11-13. 1928-1929. [...]
minutes. Authors: [...] Samuel R. Powers, [...]. World Book Company, [...].
3. Cooperative Chemistry Test, high school. 1939-1947. Revised forms [...].
Time: [...] minutes. Authors: [...] R. Powers, Victor H. Noll, et al. [...]
New York.
4. [...] Chemistry Test, Educational Records Bureau edition, college
preparatory schools. 1941-1943. [...]. Time: 80 minutes. Norms for
preparatory schools only. Authors: Charles L. Bickel, W. Gordon Brown,
Robert N. Hilkert, C. S. Hitchcock, and H. H. Loomis. Cooperative Test
Service, New York.
5. Examination in Chemistry, high school level. 1944. Form B. Time: 120
(125) minutes. Authors: Examination Staff of U.S. Armed Forces Institute.
American Council on Education, Cooperative Test Service, New York.
6. Glenn-Welton Chemistry Achievement [...]. [...] Yonkers, N.Y.
7. Kirkpatrick Chemistry Test, high school, first and second semesters.
1940-1941. Forms A and B. Authors: Ernest L. Kirkpatrick and H. [...].

IV. PHYSICS

1. Columbia Research Bureau Physics Test, grades 11-14. 1926 [...]. Time: 90
minutes. [...] Farwell and [...].
2. Cooperative Physics Test, revised series, high school. [...]. Forms P, Q,
S, and X. Time: 40 minutes. Machine scorable. Used at end [...] of 2
semesters. Author: H. [...] Farwell. Cooperative Test Service, New York.
3. Cooperative Physics Test, Educational Records Bureau edition, college
preparatory schools. 1941-[...]. Forms ERB-R, ERB-S, and ERB-T. Time: [...]
minutes. Authors: Russell S. Bartlett, Lester D. Beers, Winston M.
Gottschalk, Robert G. Poland, and Alan T. Waterman. Cooperative Test
Service, New York.
4. [...] Test, high school. 1934. Two forms. Two parts. Test I, mechanics;
Test II, [...] magnetism, electricity, and sound. Time: 40 minutes. Authors:
V. G. [...] and H. E. Schrammel. Bureau of Educational Measurements, Kansas
State Teachers College, Emporia, Kans.
5. Glenn-Obourn Instructional Tests in Physics, high school and college.
1930. Twenty-five complete tests, one for each topic. Authors: Earl R. Glenn
and Ellsworth S. Obourn. World Book Company, Yonkers, N.Y.
6. Physics: Every Pupil Test, high school. 1946-1947. New form each year.
Time: 40 (45) minutes. Author: Darwin Kimble. Ohio State Department of
Education, Columbus.

1. [...] the measurement [...] method been so retarded [...]?
2. What are the leading characteristics of the scientific method? Describe
in some detail a test which attempts to measure aspects of the scientific
method.
3. What aspects of the scientific method are measured in such an instrument
as the Cooperative Chemistry Test?
4. Compare the science tests of the Coordinated Scales of Attainment with
those of the Stanford Achievement test as to (a) method of [...] and
(b) coverage of scientific facts. In what respects are they alike?
5. Compare the contents of the science tests occurring in each test battery.
Which seem to you the best?
6. Describe in some detail the Cooperative General Science Test. How do you
justify a test of [...] in such a test? Why is the [...] of terms and
concepts [...]? What are the limitations of the content [...] in a test of
general science?
7. Do you think that the Cooperative Biology [...] applies facts and
principles to [...] of practical problems? [...] the records from such [...]
to influence school marks? Why?
8. Name two [...]. [...] one of them [...].
9. How [...] the test of chemistry [...] items in a functional setting? What
[...] principal types of objectives of the Cooperative Chemistry Test?
Compare with the Columbia Research Bureau Chemistry Test in the types of
material covered and the manner of item construction.
10. Why is there a [...] physics at the present time? What [...] of problems
are included in the Cooperative Physics Test? Does it test a student's
ability to formulate hypotheses [...] to [...] for an inference?
11. Show how instructional [...] the end of each unit [...] in science. What
desirable [...] described? What limitations [...] to the use of such tests?
12. Set up a plan for testing the development of attitudes [...] science.


BIBLIOGRAPHY

CURTIS, DWIGHT K.: The Contribution of the Excursion to Understanding,
doctor's [...], State University of Iowa, [...].
DAVIS, IRA C.: "The Measurement of Scientific Attitudes," Science Education
(1935) [...].
[...] N.: "Testing the Test [...]," School Science and Mathematics (1932)
32:490-502.
EDUCATIONAL RECORDS BUREAU: Some [...] on the Difficulty and Validity of the
Cooperative Tests in Biology, Chemistry, and Physics, Forms ERB-R, [...]
1941 Achievement Testing Program in Independent Schools and Supplementary
Studies, Educational Records Bulletin [...]. New York: Educational Records
Bureau, 1941.
FLANAGAN, JOHN C.: The Cooperative Achievement Tests: A Bulletin Reporting
the Basic Principles and Procedures Used in the Development of Their System
of Scaled Scores. New York: Cooperative Test Service of the American Council
on Education, 1939.
Forty-fifth Yearbook of the National Society for the Study of Education,
Part I, "The Measurement of Understanding." Chicago: University of Chicago
Press, 1946.
FRUTCHEY, FRED P.: "Illustrative [...] Exercises in High School Chemistry,"
Educational Research Bulletin (1937) 16:122-26.
GRAY, H. A.: "Approach to the Measurement of Biological Attitudes and
Appreciations," Journal of Educational Research (1934) 28:25-29.
GREENE, HARRY A., ALBERT N. JORGENSEN, and J. RAYMOND GERBERICH: Measurement
and Evaluation in the Secondary School, Chap. XIX. New York: Longmans, Green
& Co., Inc., 1943.
HAWKES, H. E., E. F. LINDQUIST, and C. R. MANN (eds.): The Construction and
Use of Achievement Tests. New York: Houghton Mifflin Company, 1936.
HOFF, A. G.: "A Test for Scientific Attitude," School Science and
Mathematics (1936) 36:763-770.
NOLL, VICTOR H.: The Teaching of Science in Elementary and Secondary
Schools, Chap. III. New York: Longmans, Green & Co., Inc., 1939.
ODELL, C. W.: Educational Measurements in High School, Chap. VIII. New York:
Appleton-Century-Crofts, Inc., 1930.
Redirecting Science Teaching in the Light of Personal-Social Needs, A Report
under the Sponsorship of the American Council of Science Teachers in
Cooperation with Nine National [...] Teachers of Science [...].
"Science," Vol. IV, Proceedings of the Workshop in General Education.
Chicago: University of Chicago Press, 1940.
Science in General Education, Report of the Committee on the Function of
Science in General Education, Commission on Secondary School Curriculum,
Progressive Education Association. New York: Appleton-Century-Crofts, 1938.
SMITH, EUGENE R., RALPH W. TYLER, et al.: Appraising and Recording Student
Progress. New York: Harper & Brothers, 1942.
ZAPF, ROSALIND M.: "Superstitious Beliefs," School Science and Mathematics
(1939) 39:54-62.


CHAPTER 11 


Measurement of Business Education 


ati 1 с 
y qr iet pasons. Stenographers, typist5; 
€ business world. This need was at that time being met by private 


© r 
Пед, Тһе demands and needs of the time led to the introduction of 
Ctical courses in business in the high school. 
During the last twenty-five years a new impetus has been introduced
into business education. Since the publication of Your Money's Worth¹
in 1927, it has become clear that the consumer also needs some training
[...]. Moreover, school administrators were wondering if there were
not some cultural values in these business courses which would be
of value to the general student. Gradually two major objectives in
business education [...]:

1. [...] jobs through such courses as stenography, bookkeeping, and
typing, and in this connection also to help them to (a) become aware
of the way business is conducted so that their school subjects will be
immediately functional, and (b) become aware of opportunities in
clerical work at a higher level, such as secretarial work, as well as
of those activities which require technical training.

2. To make of every individual an intelligent consumer of [...] by
acquainting him with the fundamental principles on which business is
based. Here the major emphasis will be upon consumer education.

In division 2, emphasis will be placed on business law, economic
geography, and [...] business. Topics such as [...] insurance, taxes,
and a host of others which [...] consumer are the ones to be studied.²


¹ Chase, Stuart, and F. J. Schlink, Your Money's Worth. New York: The
Macmillan Company, 1927.

² Tonne, Herbert A., Consumer Education in the Schools, especially
Chap. 8. New York: Prentice-Hall, Inc., 1941.




PROBLEMS OF TESTING 


From the outline of the purposes and objectives of business education
just made, it is immediately apparent that the problems of the
testers may also be divided into two parts. In one case, habit
formation and skills are to be measured; in the other, understandings,
comprehension, and information are the major considerations.


CLERICAL TESTS

If we place stenography, bookkeeping, typing, filing, comptometer
work, and secretarial duties under the heading of clerical work, our
major problem is to set forth tests of (1) clerical aptitude and (2)
clerical achievement. In the recent emphasis upon guidance the
measurement of clerical aptitude has achieved an important place.


[...] tests of clerical aptitude was the Minnesota Vocational Test
for Clerical Workers. [...] of which are the same and 100 different
[...] to 16 letters [...]. Here are some sample sets of numbers:²
121. 46273—46273
122. 629—620
123. 7382517283—7382517283
124. 637281630281—[...]
125. 2738261278961—[...]
126. 627152637490—627152637490
127. 73526189—73526189
128. 5372—5302
129. 63728142—63728124
130. 4783946—4783046
Ten items of checking names are:²

121. Bob Fairban[...]
122. Denton Produ[...]
123. [...] Co.—Wells Dickey Inc.
124. S. N. Jonas—S. N. Jonus
125. Warren Co.—Warren Co.
126. Kelly Transfer—Kelly Transfer
127. S. Karpen & Brothers—S. Karpen & Brothers
128. A. J. Drexel—A. J. Drexel
129. C. H. Salmon[...]
130. H. Simons[...]

¹ See Bingham, Walter Van Dyke, Aptitudes and Aptitude Testing, Chap.
XIII, pp. 322-329. New York: Harper & Brothers, 1937.

² Andrew, Dorothy M., Donald G. Paterson, and Howard P. Longstaff,
Minnesota Clerical Test. New York: The Psychological Corporation, 1933
and 1946. By permission.




From [...] these samples it is clear that this is [...]. The short
form of 200 items [...] time and the long form, which is [...].

The reliability of this test is about .90. Its validity has been
studied [...]. The test has correlated from .54 to .64 with
supervisors' ratings of [...] achievement; and it correlates well with
other measures of clerical achievement. Name checking correlates with
the speed of reading about [...] and with spelling .65; while number
checking correlates with [...] computation about .51. Its correlation
with intelligence is low [...]. Critical reviewers of the test state
that it is a usable test for selecting [...] clerical workers and is a
satisfactory instrument for picking out students for clerical
training. Its use for over 16 years for [...] further attests its
validity. Criticism is aimed only at its [...], for it does not test
the more complex functions involved in the upper levels of clerical
work.

Separate percentile norms are available for men and women in a
variety of clerical occupations such as stenography, office machines
clerks, bookkeepers and accountants, routine clerical workers, etc.
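
The number-checking and name-checking items quoted above all ask the same thing: are the two members of a pair identical or not? A minimal sketch of that judgment follows. The function names and the one-point-per-correct-item scoring rule are assumptions made for illustration; the text does not state how the published test is scored.

```python
def is_same(left: str, right: str) -> bool:
    """True when both members of a checking pair are identical."""
    return left == right


def score_checking(pairs, marked_same):
    """Count correct judgments (assumed rule: one point per correct item)."""
    return sum(
        is_same(left, right) == mark
        for (left, right), mark in zip(pairs, marked_same)
    )


# Pairs taken from the sample items quoted in the text.
pairs = [
    ("46273", "46273"),
    ("629", "620"),
    ("5372", "5302"),
    ("S. N. Jonas", "S. N. Jonus"),
]

# The scoring key is simply the true same/different status of each pair.
key = [is_same(left, right) for left, right in pairs]
print(key)

# A hypothetical examinee who wrongly marks the third pair as "same".
print(score_checking(pairs, [True, False, True, False]))
```

The sketch makes plain why critics call the task simple: each item requires only a character-by-character comparison, carried out quickly.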


Stenographic Aptitude Tests

An example of a more specialized aptitude test is the E.R.C.
(Educational Research Corporation) Stenographic Aptitude Test. This
test, whose author is Walter L. Deemer, consists of five parts:

1. Speed of writing. The subject copies the Gettysburg Address. He
writes as fast as he can, but his writing must be legible.

2. Word discrimination. The subject must distinguish between the
use of “current” and “currant,” “advice” and “advise,” “illusion”
and “allusion,” “base” and “bass” when used in sentences. There are
34 pairs of words. Moreover, 16 samples of choices between three
words in sentences are present in the tests. Illustrations are
“writes,” “rights,” and “rites”; “sight,” “site,” and “cite”; etc.

3. Phonetic spelling. Fifty phonograms must be spelled out correctly.
Here are a few samples: injer, kawf, awt, skeem, hoom.

4. Vocabulary. There are 50 words in short sentences to be defined
by choosing from five others the meaning of the word in question. For
example, a flitch of bacon is to be defined.

5. Dictation. Sentences are dictated at a specified rate.

The reliability is not reported. The author of the test believes that
since the validity has been proved satisfactory the reliability must
be. However, this is a fallacy because the reliability would show
whether further improvement was necessary. If the reliability were
.75, considerable improvement would be possible. If it were .93,
hardly any more improvement could take place.
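
The argument about room for improvement can be illustrated with the Spearman-Brown formula. The text does not mention this formula; it is brought in here only as a sketch. It estimates the reliability of a test lengthened k times from its present reliability r. The function name and the choice of doubling the test are assumptions made for the example.

```python
def spearman_brown(r: float, k: float) -> float:
    """Predicted reliability of a test lengthened k times (Spearman-Brown)."""
    return k * r / (1 + (k - 1) * r)


# Compare the two reliabilities used in the text when the test is doubled.
for r in (0.75, 0.93):
    doubled = spearman_brown(r, 2)
    print(f"r = {r:.2f}: doubled length gives {doubled:.3f}, "
          f"a gain of {doubled - r:.3f}")
```

Under this assumption the test starting at .75 gains roughly three times as much from doubling as the test starting at .93, which agrees with the point that a reported reliability tells the user how much further improvement is possible.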


Its validity has been well established by correlating it with [...]
achievement (r .65) and with accuracy of transcription of [...]
(r .70). The test is more exactly a shorthand test than one of
stenography since it omits several aspects of stenography. Giving and
scoring the test offers a few difficulties. The material for dictation
[...] given at a defined rate which takes practice to administer
correctly. The scoring is tedious because the scorer must count [...]
omitted, inserted, or substituted. Furthermore, the test's [...] has
been demonstrated in grades 11 and 12 but not in secretarial schools.
One of its unique characteristics is a table of predictions. Subjects
with scores from 345 to 245 (the subjects within a moderate range)
have the scores they will most probably achieve after the passage of
two years.

There are several other tests of stenographic aptitude. Three [...]
will be mentioned briefly. The Turse Shorthand Aptitude Test [...]
seven divisions:

1. Stroking: speed of drawing short lines
2. Spelling: select one or none of three words (45 words)
3. Phonetic association: serten, setl, eksit (60 associations)
4. Symbol transcription: substitution of symbols for letters (six
sentences)
[...] in a paragraph

This test is well constructed and standardized and has had
considera[...]tion. Critical eval[...] tests [...] follow in this
chapter appear in the Nineteen Forty Mental Measurements Yearbook and
the Thi[...].

CLERICAL ACHIEVEMENT TESTS


Achievement in Stenography

The construction of achievement tests in stenography has been
stimulated both by the schools where good standards of [...] in effect
and by businessmen who wished to employ competent stenographers. More
lately workers in the Army developed what were usually designated as
“examinations” which usually required more time in their
administration. An example of the first type is the Turse-Durost
Shorthand Achievement Test; of the second, Stenographic Test,
United-NOMA Business Entrance Tests; of the third, Examination in
Gregg Shorthand.

In [...] tests, measures of actual performance play a prominent
part. This is achieved in a variety of ways. Printed words are [...]
in shorthand, shorthand is transcribed into longhand, and sentences
[...] are to be completed by a choice from several printed [...]. In
some tests a printed article of two or three hundred words is to be
written in shorthand outlines on lines above the print left for that
purpose. In one or two tests, syllabication, English [...] are added.
But in all these tests actual dictation is taken and [...].

One [...] is the Hiett Stenography Test (Gregg) which includes many
of the procedures just described. This test is divided into five
parts:

1. [...] printed words to be reproduced in shorthand
2. [...] shorthand symbols to be transcribed
3. [...] sentences written in shorthand the completion of which is
contained in four printed words
4. An article of 200 printed words, the shorthand outlines to be
[...] line left for that purpose
5. [...] dictation (3 minutes) and longhand transcription

[...] are available based on testing 5,296 students in 358 schools
[...]. The reliability is low, .15. Some of the shorthand [...] for
correcting are hard to read, and the directions could [...] clearer.
An achievement test suitable for the first year of stenographic work
is the Examination in Gregg Shorthand. Measures of achievement are
secured in three sections.

Section A. 175 printed words and phrases to be written in shorthand.

Section B. Shorthand reading test. The subject transcribes into
longhand [...] words.

Section C. Three letters are taken at three different rates of speed:
(1) [...] words per minute, (2) 60 words per minute, and (3) 70 words
per minute. The rate is controlled by printed material which is marked
[...].

The test, printed in 1944, has no study of reliability or validity
[...] the present time (1951). However, percentile [...] practically
all the major [...] contained in the Gregg manual are contained in the
test. Its further [...] is recommended.
When we turn to the testing of sufficient proficiency for entrance
into business, the Stenographic Test, United-NOMA Business Entrance
Tests come immediately to mind. In these tests [...] and competent
teachers have combined their efforts to [...] office conditions. They
have made the test long [...] ample coverage of the skills involved in
a realistic office [...]. In this test 30 minutes are given over to
dictation, with 5 minutes allowed for extra dictation. There are also
allowed 90 [...] transcription. Nine letters are to be transcribed in
mailable [...] straight matter to be typed in the form of a first
draft. There is a new edition each year. Percentile norms for the year
are furnished [...] and business firms. Its reliability is adequate,
.90. Some forms [...] been tried on high school graduates and on those
who are regularly employed as typists. The high school students were
more [...] become confused during the latter part of the test. Some of
them [...] to finish the long assignment or else jumbled their work.
It will be remembered that these tests are given at regular times only
under standard conditions by designated testers.


Achievement in Typing

Like achievement in stenography there are two types of tests; one of
these indicates progress toward a less ambitious goal after [...] the
subject for a year or two; the second indicates an achievement
sufficiently advanced for the subject to enter directly into a
business office. Representing the first type might be mentioned the
Commercial Education Survey Tests. These tests illustrate well the
general trend of achievement tests in typing. They are divided into
(1) junior typewriting, first year, 95 minutes; and (2) senior
typewriting, second year, 120 minutes. The test for junior typewriting
is composed of five tests:

Test I. Standard stroking test
  Part A. 411 words, 73 per cent from Horne's list of [...] most
  common words, 5 minutes
  Part B. 407 words, 70 per cent from Horne's list, 5 minutes
Test II. Business-letter test: following instructions in writing a
standard business letter, 25 minutes
Test III. Completion test: 25 uses of parts of the typewriter, 15
minutes
Test IV. A placement test: mechanics involved in placing [...] on a
page
Test V. Centering test: names of twelve of Shakespeare's plays to be
typed on a page

The senior test uses the first three tests and adds [...] table and a
rough-draft test. Its letter to be copied is [...] one used in the
junior test. The scoring is quite typical of the way in which typing
tests are scored. If, in the Standard stroking test, 200 [...] typed
per minute without an error the score is 200. If a word of [...]
letters is omitted then five strokes are subtracted. This would mean
[...] a minute, so the score would be 199. For each error 10 [...]
subtracted from the total strokes per minute, etc. Thus the score is
dependent on rate and accuracy.
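
The arithmetic of the stroking score can be sketched as follows. The text is damaged at this point, so the sketch rests on two stated assumptions: that omitted letters are deducted from the total strokes before dividing by the five minutes of the test, and that the 10-stroke error penalty is likewise deducted from the total. The function and parameter names are illustrative and do not come from the test manual.

```python
def net_strokes_per_minute(total_strokes: int, minutes: int,
                           omitted_strokes: int = 0, errors: int = 0,
                           penalty_per_error: int = 10) -> float:
    """Net rate after deducting omissions and error penalties.

    Assumption: both deductions come off the total stroke count, which
    is then divided by the length of the test in minutes.
    """
    net = total_strokes - omitted_strokes - errors * penalty_per_error
    return net / minutes


# A 5-minute test typed at 200 strokes a minute with no faults.
print(net_strokes_per_minute(1000, 5))

# The same performance with one five-letter word omitted.
print(net_strokes_per_minute(1000, 5, omitted_strokes=5))
```

With a five-letter word omitted the total falls from 1,000 to 995 strokes, or 199 a minute, which matches the figure given in the text.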

[...] entrance into business is the Typing Test, United-NOMA
Business Entrance Tests. The description of its parts will show that
it is not radically different from the Commercial Education Survey
Test:

1. Typing [...] a corrected rough draft
2. Setting up a letter from a running copy
3. Simple tabulation on a form
4. [...] tabulation on a plain sheet of paper
5. Typing a form letter with parts to be filled in

Like the preceding test it is scored for (1) form and arrangement of
typed matter, (2) accuracy, (3) time consumed, and (4) ability to
follow instructions. The reliability is estimated to be .90. Composite
total scores include both speed and accuracy. Separate norms for these
two factors might be useful under certain conditions. Its norms are
percentile scores [...] for the year of the testing. These are sent to
the [...] and to employers. The tests are administered under standard
conditions and sent to a central office for correction. Certificates
of proficiency are sent to those who satisfy certain minimum
requirements.

Most of the other tests which are now listed are constructed much
like the two just described.


LISTS OF TESTS IN STENOGRAPHY AND TYPEWRITING


I. STENOGRAPHIC APTITUDE TESTS

1. Stenographic Aptitude Test, grades [...]-16. 1939. One form.
Author: George K. Bennett. No validity coefficient for [...].

2. E.R.C. (Educational Research Corporation) Stenographic Aptitude
Test, grades [...] and over. 1944. Time: [...] minutes. Author: Walter
L. Deemer (see text). Science Research Associates, [...].

3. [...]

4. Minnesota Clerical Test, grades 8-12 and adults. 1933-1946. One
form. [...] minutes. Authors: Dorothy M. Andrew, Donald G. Paterson,
and Howard P. Longstaff (see text). Psychological Corporation, New
York.

5. Detroit [...] Aptitude Examination, high school. [...] One form.
Time: 30 minutes. Authors: Harry J. Baker and Paul L. Voelker. Public
School Publishing Company, Bloomington, Ill.


II. STENOGRAPHIC ACHIEVEMENT TESTS

1. Examination in Gregg Shorthand, [...] year high school. 1944. Form
B. [...] minutes. Authors: Examination Staff of the U.S. Armed Forces
[...]. Cooperative Test Service, [...].




2. Hiett Stenography Test (Gregg), 
high school. 1938-1939. Forms B and C. 
Two levels. Time: 40 minutes. Authors: 
Victor C. Hiett and H. E. Schrammel 
(see text). Bureau of Educational Meas- 
urements, Kansas State Teachers Col- 
lege, Emporia, Kans. 

3. SRA Dictation Skills, high school 
and adults. 1947. Six 12-inch records: 
two for accuracy, four for speed. 
Authors: Marion W. Richardson and 
Ruth A. Pedersen. Science Research
Associates, Chicago.

4. Stenographic Test, United-NOMA
Business Entrance Tests, schools and
industry. New form each year. Authors:
Joint Committee on Tests, United Business
Educational Association and NOMA.
National Office Management Association,
New York.

5. Turse-Durost Shorthand Achievement
Test, Gregg dictation, 1-2 years in
high school. 1941-1942. Time: 60 minutes.
Authors: Paul L. Turse and Walter
N. Durost. World Book Company,
Yonkers, N.Y.

6. Blackstone Stenographic Proficiency
Tests, commercial schools or
business firms. One form. Time: 50
minutes. Author: E. G. Blackstone.
Psychological Corporation, New York.


III. ACHIEVEMENT TESTS OF TYPEWRITING

1. Examination in Typewriting, first and second years high school.
194[...]. [...] form. Two levels. First year secondary school, 130
minutes; second year secondary school, 115 minutes. Authors:
Examination Staff of U.S. Armed Forces Institute. Cooperative Test
Service, New York.

2. Kauzer Typewriting Test, high school. 1934. Three levels: first
semester, second semester, and fourth semester. Time: 15-25 minutes.
Authors: Adelaide Kauzer and H. E. Schrammel. Bureau of Educational
Measurements, Kansas State Teachers College, Emporia, Kans.

3. Typing Test, United-NOMA Business Entrance Tests, school and
industry. 1939-1947. New form each year. Authors: Joint Committee
[...], United Business Educational Association and NOMA. National
Office Management Association, New York.

4. Commercial Education Survey Tests, high school. One form. Two
levels: junior typewriting, first year, [...] minutes; senior
typewriting, second year, 120-130 minutes. Author: Jane E. Clem.
Public School Publishing Company, Bloomington, Ill.


[...] are aimed directly at vocational competence. Tests and
measures give [...] an ability to make accurate [...] firm or
business. [...] these the Examination in Bookkeeping [...] newest and
most complete.¹ It now [...] test for the first year of bookkeeping
[...] into four parts whose purposes are as follows:
1. Section A tests knowledge of important accounting terms, facts,
and principles, such as “[...] ledger,” “budget,” “drawee,” “single
proprietorship,” [...] “petty cash fund,” “net profit,” [...] “[...]
worth,” “net loss,” “operating expense,” etc.

2. Section B tests understanding of the method of adjusting and
closing accounts. The directions say: “For each of the [...] listed
[...] which account should be debited and which should be credited.
Show your choice in each case by writing the letters of the [...] in
the proper spaces on the answer sheet.” The answer sheet is separate
from the test.

Accounts

A. Bad debts
B. Delivery equipment
C. Depreciation [...]
D. Expired insurance
E. Interest [...]
F. Interest receivable
G. Merchandise inventory
H. Prepaid insurance
I. Profit and loss summary
J. Proprietor's drawing account
K. Purchases
L. Reserve for bad debts
M. Reserve for depreciation of delivery equipment
N. Sales
O. Store supplies
P. Store supplies used

Example

To record the insurance expired: Expired insurance is debited. Prepaid
insurance is credited. The letter D has been placed in the debit
column and H in the credit [...]. Look at the answer sheet to see how
this has been written.

Answer Sheet

Db  Cr

Ten statements are to be analyzed in the same way. The two following
items are examples:

47. To record the ending merchandise inventory.
51. To record interest accrued on notes receivable.
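
The worked example in the directions (D in the debit column, H in the credit column) suggests how such an answer sheet might be keyed. The sketch below covers only that example, since the text gives no key for the other items. The rule that an item earns credit only when both letters are right is an assumption, as are all the names used.

```python
# Letters and account names as given in the worked example of the text.
ACCOUNTS = {"D": "Expired insurance", "H": "Prepaid insurance"}

# Key for the worked example only: (debit letter, credit letter).
KEY = {"example": ("D", "H")}


def score_item(item: str, debit: str, credit: str, key=KEY) -> int:
    """Return 1 when both letters match the key, else 0 (assumed rule)."""
    return 1 if key[item] == (debit, credit) else 0


def describe(item: str, key=KEY) -> str:
    """Spell out the keyed entry in words."""
    debit, credit = key[item]
    return f"Debit {ACCOUNTS[debit]}; credit {ACCOUNTS[credit]}"


print(describe("example"))
print(score_item("example", "D", "H"))
print(score_item("example", "H", "D"))
```

Reversing the two letters scores zero under this rule, which reflects the purpose of the section: the student must know which account is debited and which is credited, not merely which two accounts are involved.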
3. Section C tests skill in analyzing and recording bookkeeping
entries in books of original and final entry.

Directions: Assume you are the bookkeeper for W[...], a lumber
merchant. In your answer booklet you will find sections of the
following:

Journals              Page     Ledgers                       Page
[...] Journal         [...]    General Ledger                4 & 5
[...] Journal         [...]    Accounts Receivable Ledger    6
[...] Journal         [...]    Accounts Payable Ledger       6
Cash [...] Journal    [...]
[...] Journal         [...]


Step I. Record the following transactions in the proper books of
original entry, which are on pages 2 and 3 of your booklet.

Then there follow 13 transactions, dated between August 1 and 31,
of which the three following are examples:

August 2. He paid $120 cash for August rent.

August 14. Sold lumber on account to F. C. Mann, 406 Maple Ave., City.
$600; terms, 2/10, n/30.

August 31. Received $750 from cash sales of lumber, August 1 to 31.
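
The terms "2/10, n/30" in the August 14 transaction are standard credit terms: a 2 per cent discount if the bill is paid within 10 days, otherwise the full amount is due within 30 days. The sketch below works the arithmetic for the $600 sale. The function names are illustrative and are not part of the examination.

```python
from decimal import Decimal


def amount_due(invoice: Decimal, days_after_sale: int,
               discount_pct: Decimal = Decimal("2"),
               discount_days: int = 10) -> Decimal:
    """Amount owed under terms such as 2/10, n/30."""
    if days_after_sale <= discount_days:
        # Payment inside the discount period earns the discount.
        return invoice * (Decimal("100") - discount_pct) / Decimal("100")
    return invoice


def is_past_due(days_after_sale: int, net_days: int = 30) -> bool:
    """True once the net period has run out."""
    return days_after_sale > net_days


sale = Decimal("600")
print(amount_due(sale, 10))   # paid on the last day of the discount period
print(amount_due(sale, 11))   # paid one day too late for the discount
print(is_past_due(31))
```

A student recording the later receipt of cash from this customer would need this calculation to decide whether a sales discount is to be entered.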


The test then continues as follows: 


Step II. Post the journal entries to the proper ledger accounts on
pages 4-6 of your answer booklet. The student is warned to (a) post the
individual entries to the correct accounts, (b) total and rule the
proper journals, and (c) post the proper journal column balances to the
correct accounts.


4. Section D tests skill in preparing a ten-column work sheet.

Directions: On page 7 of your answer booklet you will find a [...]
work sheet which you are to complete. The account names and the trial
balance are listed on the work sheet. These accounts have no
connection with the accounts used in Section C. The necessary
information for the adjustments is given below on this page.

In preparing the work sheet, you are to:
A. Enter the necessary [...] “adjustments” column of the work sheet.
B. Complete the work sheet.

[...] ($10), [...] ($20), depreciation on office [...], accrued
salaries payable, etc. The work sheet is to be [...] after these
entries are made.
While there are no reliability and validity studies of this test, it
measures well the ordinary procedures used in bookkeeping. Its length
(time 3 hours) may be necessary to measure actual performance.
The second test, the Bookkeeping Test, United-NOMA Business Entrance
Tests, is, as [...] name implies, meant to provide information [...]
the United Business Education Association and the National Office
Management Association. Fitness for immediate employment is indicated
by (1) the understanding of the principles and practice of
bookkeeping, (2) ability to follow instructions, and (3) neatness. The
test involves (1) a correction of the incorrect entries [...], (2)
correction of the incorrect [...], [...] a new [...] balance in the
general ledger, etc. Some [...] errors is more like accounting than
[...]. [...], however, argue that “if he can locate and correct [...]
that he can also do the original [...].” This test has an estimated
reliability of .90. The scoring of the test is not entirely objective
since it must be rated for neatness on a [...] scale. From the results
of this test certificates are issued. From the standpoint of the
teacher this test is of little value except in a [...] way because the
test is administered by experts and scored in a central office.


Several other tests of bookkeeping appear in the following list:


LIST OF TESTS OF BOOKKEEPING 


1. Shemwell-Whitcraft Bookkeeping Test, high school, first and second
semesters. 1937-[...]. [...] forms. Two levels. [...] minutes.
Authors: E. C. Shemwell, [...] E. Whitcraft, and H. E. Schrammel.
Bureau of Educational Measurements, Kansas State Teachers College,
Emporia, Kans.

2. Examination in Bookkeeping and Accounting, high school. 1944-1945.
[...] form. Time: 180-190 minutes. Two levels, first [...] secondary
school (1944) [...] year secondary school (19[...]). Section A,
knowledge of important accounting terms, facts, and principles, 40
minutes; Section B, understanding of the [...] of adjusting and [...]
accounts (credit or debit, [...] accounts in double [...]keeping), 20
minutes; Section C, skill in analyzing and recording bookkeeping [...]
in books of original and final entry, [...] minutes; Section D, skill
in preparing 10-column work sheet, 45 minutes. Authors: Examining
Staff of [...] Armed Forces Institute. Cooperative Test Service, New
York, or Science Research [...], Chicago.

3. Bookkeeping Tests, State High Schools Tests for Indiana, first,
second [...] and fourth semesters. 1942-1945. New forms scheduled for
each year. Time: 40-45 minutes. Authors: M. E. Studebaker, B. M.
Swinford, V. H. Carmichael, F. R. Botsford, and R. Burkheart. State
High School Testing Service, Purdue University, Lafayette, Ind.

4. Breidenbaugh Bookkeeping Tests, high school. 1936. One form. Four
levels. Single-proprietorship high school bookkeeping course. Test 1,
first half of course, nontimed (50-60 minutes); Test 2, first half of
course, nontimed (50-60 minutes); Test 3, second half of course [...]
(30-60 minutes); Test 4, [...] of course, nontimed (100 minutes).
Journalizing, adjustments, balance sheet, statement of profit and
loss, closing entries, and worksheet. Author: V. E. Breidenbaugh.
Public School Publishing Company, Bloomington, Ill.

5. Bookkeeping Test, United-NOMA Business Entrance Tests, school and
[...]. 1939-1947. New form each [...]. 120-130 minutes. Authors: [...]
Committee on Tests, United Business [...] Association and NOMA.
National Office Management Association, New York.

6. Elwell-Fowlkes Bookkeeping Test, high school. One form. Two levels,
to be used at end of first and second semester's work. Time: 60
minutes. Measures general theory, journalizing, adjusting entries,
closing the ledger, and [...] statements. Tests have [...] diagnostic
value. Authors: F. H. Elwell and J. G. Fowlkes. World Book Company,
Yonkers, N.Y.

¹ See Third Mental Measurements Yearbook (Oscar K. Buros, ed.), Item
368. New Brunswick, N.J.: Rutgers University Press, 1949.

There are two other types of work which might be classified as
dependent on skill: filing and machine calculation. In each of these
[...] satisfactory tests have been constructed by the testing committee
of the United Business Educational Association and National Office
Management Association. Their names are (1) Filing Test, United-NOMA
Business Entrance Tests, 1939-1947, and (2) Machine Calculation [...],
United-NOMA Business Entrance Tests, 1939-1947. For a composite
score in each of these tests their test scores are combined with those
of the Business Fundamentals and General Information Test which is
described in the next section.


CONTENT TESTS 
Under content tests are included:

1. General tests of business information
2. Business English
3. Commercial or business arithmetic
4. Commercial law
5. Economic geography
6. Interest in business

Several aspects of bookkeeping and accounting would also fall under
this heading.

Under Item 1 are usually included tests of information which [...]
in a business office need. Spelling, punctuation, elementary [...],
and some knowledge of current events are included. The United-NOMA
Series of Business Entrance Tests includes such a test in the
requirements for certificates in typewriting, stenography,
bookkeeping, [...]. An illustration of a somewhat different test is the
General Test of Business Information (see list) which is suitable for
grades 9 to 16. This test includes questions ab[...] as “C.O.D.” and
about the frequency [...]. The test claims to cover the [...]
information that a high school or [...]. [...] is also some
opportunity for diag[...]. The reliability of this test is indicated by
a coefficient of .91. Its validity was checked against the subject
matter contained in textbooks [...] syllabuses and by submitting such
items to critics in the field.

A [...] test, very different in nature, is the Business Fundamentals
[...] General Information Test of the United-NOMA Business Entrance
[...]. This test is not intended for diagnosis and remedial treatment
[...] proficiency in business. It tests grammar, punctuation, and
[...] along with fundamentals in arithmetic and general information
usually accumulated from listening to the radio and reading the
newspapers. Its reliability is estimated from a previous test made
after the same manner and having reliabilities indicated by
coefficients ranging from .75 to .84. Its validity is assured by the
intimate acquaintance with the field of its constructors, who are a
combination of teachers [...] of office workers. No careful study has
been made of the correlation between success on this test combined
with a test of skill (stenography, typewriting, etc.) and subsequent
success in an office.

As for business English, one of the needed tests constructed by the
Examination Staff of the Armed Forces Institute is Examination in
[...] English at the high school level. It is a test of considerable
length (testing time, 2 hours) which offers an opportunity to cover
the topic thoroughly. [...] sections:

Section I. The selection of [...] words from a list of 100 words
essential to ordinary business communication.

Section II. [...] usage: 25 pairs of words frequently confused in
business, e.g., “principal” and “principle,” “accede” and “exceed.”

Section III. Twenty matters of form and usage: address, wording of
[...] letters, complimentary close, etc.

Section IV. A test of grammar and [...]. The subject must discover
errors in sentences.

Section V. Three letters which are to test recognition of
effectiveness. These are (1) a [...], (2) a reply to a request for
[...], (3) a recommendation. Each sentence is written in [...]: (1)
one lively but crude, (2) one affected and wordy, and (3) one [...]
and sincere. The subject must choose one of the three [...] sentence.

Up to the present there is no reliability reported but the test's
length is assur[...] satisfactory reliability. Norms based on 1,200
[...] are being improved. Since norms are calculated for both the
parts [...] total, there is some opportunity for analyzing errors
which [...].

[...] three tests are simply included [...] areas of business
arithmetic, [...].




TESTS OF GENERAL BUSINESS CONTENT 


1. General Test of Business Informa- 
tion, grades 9—16. 1942-1943. Forms A 
and B. Time: 40-45 minutes. Author: 
Stephen J. Turille. Bureau of Educa- 
tional Measurements, Kansas State 
Teachers College, Emporia, Kans.

2. Business Fundamentals and Gen- 
eral Information Test, United-NOMA 
Business Entrance Tests, schools and 
industry. 1939-1947. New Test each 
year. Time: 45-55 minutes. Authors: 
Joint Committee on Tests representing 
United Business Educational Associa- 
tion and NOMA. National Office Man- 
agement Association, New York. 

3. Cooperative Commercial Arithmetic Test, first and second semesters,
1944-1947. Forms U and X. Separate answer sheets. Time: 40-45 minutes.
Cooperative Test Service, New York.

4. Examination in Business Arithmetic, high school. 1944. Form B.
Separate answer sheets. Time: 135-145 minutes. Authors: Examination
Staff of the U.S. Armed Forces Institute. Cooperative Test Service,
New York, and Science Research Associates, Chicago.

5. Examination in Business English, high school level, grades
11-1[...]. Form B. Separate answer sheets. Time: 120-125 minutes.
Authors: Examination Staff of the U.S. Armed Forces Institute.
Cooperative Test Service, New York, and Science Research Associates,
Chicago.

6. Parke Commercial Law Test, high school. 1933. One form. Time: [...]
minutes. Author: L. A. Parke. Bureau of Educational Measurements,
Kansas State Teachers College, Emporia, Kans.

7. Primary Business Interests Test, grades 9-15 and adults. 1942.
[...] form. Nontimed. Author: Alfred J. Cardall. Science Research
Associates, [...].

8. Tate Economic Geography Test, high school level, grades 9-1[...].
1940. Time: 50-55 minutes. Bureau of Educational Measurements, Kansas
State Teachers College, Emporia, Kans.


SUMMARY 


[...] measured by tests of [...], vocabulary, etc. [...] taking and
transcribing dictation, [...] entering or correcting actual items in a
[...] sampled the general information which was [...] Office
Management Association. Their tests, [...] under standard conditions,
indicate the proficiency necessary to [...] directly into clerical
work.




QUESTIONS AND EXERCISES 


1. Compare the emphasis of instruction
in a consumer-education class with
[...] a class preparing to enter
[...].

2. Describe the type of items which
are [...] in a test of stenographic
aptitude. To what uses can an aptitude
test be [...]?

3. How is it possible to validate an
achievement test?

4. How do the types of items in an
aptitude test differ from those in an
achievement test?

5. How can it be said that bookkeeping
involves both skill and content?

6. Explain and illustrate the differ- 
ence between a test of skill and one of 
content. 

7. What are three characteristics of 
the tests constructed by the Examina- 
tion Staff of the U.S. Armed Forces 
Institute? 

8. What are the general purposes of 
the United-NOMA Business Entrance 
Series? What characteristic makes them 
of small use to the classroom teacher? 

9. What are other functions of 
stenographers in addition to taking and 
transcribing dictation? 




CHAPTER 12 


Measurement of Fine Arts and Manual Arts 


These two areas of fine arts and manual arts are grouped together in part for convenience and in part because there is a certain affinity between them. Performance in music and art is directly related to manual facility, while much of the success in manual arts is due to the artistic manner in which the object is constructed.

In this chapter, we shall consider the measurement and evaluation of (1) music, (2) art, and (3) manual and mechanical arts and home economics.


MUSIC 


The world of music is practically universal. What was in the past a rather select affair where people foregathered in concert hall, opera [...] academy of music has now [...] Under these conditions the school has no other course than that of introducing [...]

There are two major aspects [...]: (1) the measurement of [...]

MEASUREMENT OF TALENT IN OR APTITUDE FOR MUSIC

The measurement of [...] takes its beginning from the experimental work of Seashore who, after years of [...] under the title The Psychology of Musical Talent set forth both an analysis of [...] said he, of (1) the sense of pitch, (2) the sense of intensity, (3) the sense of time, (4) the sense of rhythm, (5) [...]



[...] apparatus and second, and perhaps more importantly, described the phonograph records on which with minute exactness were impressed the [...]

[...] of the tests was published in Seashore's Measures of Musical Talents, revised edition. These new tests embodied the main features of the original test, changing only the test of consonance to one of timbre. The revised edition calls these divisions pitch, loudness, time, tonal memory, timbre, and rhythm. It also has two series, A and B. Series A is intended to test the capacities of unselected groups of children or adults. Series B measures the capacities of more specialized groups such as musicians and prospective musicians. The test for each series is [...] on three 12-inch phonograph records with a complete test on each side of the record.
[...] Is it reliable? Does it really [...] clear: "You will hear two tones which differ in pitch. You are to judge whether the second is higher or lower than the first. If the second is higher, record H; if lower, record L." It is generally added, "If you are not sure, guess." The reliability of these measures is indicated by coefficients of correlation which for the individual tests vary from .62 to .89. The coefficients are highest for tonal memory, pitch, and loudness. The constructor recommends that these six scores not be combined into a total score but that each one be treated as a separate entity in the formation of a
e pro 


vided for gr 
es to these mea 


ncipl ШЕ 
h to discrimin 


ity Sdn we apply our strictest pri 
арьс that they are not re 
of e present in the same indi 
hha sem Апош, or less artificial manner, 


er th 

Perat, зе Measures athered › s 
wie in music as the [a under ng conditions. 

- is indicated by the corre 


The answer to this last question is indicated by the correlations which are now introduced [...] of the tests' validity. About the only criterion for the tests of musical talent is success in courses in music, whether the more theoretical courses in harmony or counterpoint or the practical courses dealing with instruments. Success in such courses is determined by [...] interest, ambition, intelligence, and previous training as well as by fundamental musical talent. For this reason, the correlation coefficients between measures of musical talent and success in music have not been very high. In reviewing 16 studies which had been completed up to that time (1931) Farnsworth¹ reports the

1 Farnsworth, P. R., "An Historical, Critical, and Experimental Study of the Seashore-Kwalwasser Test Battery," Genetic Psychology Monographs (1931) 9:291-393.


correlations with school marks in music as varying from -.08 to .45. When each of the measured traits of the Seashore tests is correlated with school marks after one semester the following correlations result:

Pitch ................ .11
Intensity ............ .07
Time ................. .20
Consonance ........... .2[...]
Tonal memory ......... .19
Rhythm ............... .25

In general, these results are much lower than the correlations [...] found. The trend can be more readily inferred from inspection of Table 9.


TABLE 9. PREDICTING SUCCESS IN THE STUDY OF MUSIC*

Test | Achievement in musical theory | Median r | Sight singing, ear training, dictation | Median r | Achievement in applied music | Median r
23 
.63 | * 

Те, e| -13-.56 | .29 | .02-.56 | .29 "s PE 

Oct T NNNM e| .03-.64 | .38 | .03-.64 | .54 “a 19 

Tonal memory... zj -16-.70 | .36 | .23-.70 | 57 0 SE Nes 

WRT К -05-.40 | .30 | .05-.40 | .30 | .07—. 25 | 30 

Rhythm... 14-.39 | 21 | 14—39] 21 p^ 06 

Consonance.............|..05-.37 | og E um | ee | 

Total scores, Seashore. e| .21-.75 | 44 -40-.70 | .46 | — Ae .33 

Mental-ability tests... -23-.66 | .41 | 23—64 .29 :03-.3 

* Predicting Success in the Study of Music, Veterans Administration Technical Bulletin TB 7-77, Dec. 21, 1947.


[...] success in musical theory [...] the tests of musical talent. When we turn to sight singing, ear training, and dictation the tests are more efficient. The median coefficients in this instance range from .21 to .57. It [...]

1 Mursell, James L., The Psychology of Music. New York: W. W. Norton & Company, 1937.


[...] above a combination of the six in a total [...]. Mental-ability tests, too, are far below the talent tests in sight singing and ear training. When we consider applied music, neither the tests nor their combination furnish any real aid [...]. The tests of intensity and consonance have no more than [...] correlations with marks in applied music. Time, the [...] with a coefficient of .23, shows only a low relationship. Mental-ability tests, with a coefficient of .33, are distinctly more closely related to [...] in the area of practical music than are Seashore's tests.

From the previous discussion, it may be inferred that for predicting musical success some combination of intelligence tests and musical tests might be better than either alone. We are fortunate in having an extended study of the combination of the Iowa Comprehension [...] Reading Test and the Seashore tests in predicting musical success in a standard musical college.¹ In this study it was established through preliminary investigations that it was practical to divide the entering students into five groups, (1) safe, (2) probable, (3) possible, (4) doubtful, and (5) discouraged, on the basis of their standings in the two tests (Seashore Measures of Musical Talent and Iowa [...] Test of Silent Reading). If the students were very low on both tests they were to be discouraged in their intention to proceed with their musical education. If they scored very high on both, then they were safe as far as their prospects for success and graduation went. The following data show the probability of graduation achieved by each group:
Group            N    Per cent graduated
Safe           125    60
Probable       143    42
Possible       195    33
Doubtful        73    23
Discouraged     29    17

[...] clearly above the others and even [...]
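
The tabulated figures can be combined into a single rate for all entering students. The following is only a sketch in Python under stated assumptions: the variable and function names are invented here for illustration, and because the source reports whole percentages, the number of graduates recovered for each group is approximate.

```python
# Group sizes (N) and per cent graduated, as tabulated above.
groups = {
    "safe": (125, 60),
    "probable": (143, 42),
    "possible": (195, 33),
    "doubtful": (73, 23),
    "discouraged": (29, 17),
}


def overall_graduation_rate(table):
    """Return the N-weighted per cent graduated across all groups."""
    total_n = sum(n for n, _ in table.values())
    # Approximate graduates per group = N * per cent / 100.
    expected_graduates = sum(n * pct / 100 for n, pct in table.values())
    return 100 * expected_graduates / total_n


rate = overall_graduation_rate(groups)
print(f"{rate:.1f} per cent of all entrants graduated")
```

Weighting by N matters here because the groups differ considerably in size; a simple average of the five percentages would give the small "discouraged" group the same influence as the large "possible" group.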


Furthermore, the students with high scores stayed in school longer, had fewer dismissals, gathered in more of the honors, and made more recital appearances than did those who received low scores. It seemed clear that this combination of intelligence test and musical-talent test was a practical success for selecting students for advanced musical [...]
able 


the 
em €T combinations which include the 


Cle ^ 

Ha, cy in icti dy a com 

nm prediction. In one 5 y \ 

1 On-Nelson Intelligence Test, and Teachers College Achievement 
„sical T: alent, Studies in the Psy- 


Test correlated .84 with school marks in sight singing [...] college students.¹ But it must also be mentioned that [...] Seashore's pitch and tonal memory correlated [...] with marks in sight singing in the case of 131 students, while a combination of [...] Intelligence Test, Iowa High School Content Examination, and Seashore's pitch and tonal [...]

usi- 
Cy of the Seashore Measures of M 


; uota- 
ison with other like measures, the following q 
tion is approximately correct? 


out those unfortunate 
out enormous effort, 


+ Tests 
Р c Tes 
musical talent, the Kwalwasser-Dykema Musi 


graph 
tests are imprinted on рдак » 
nt test uses five double-disk records, by m 


2 

3. [...] discrimination

4. Feeling for tonal movement

5. Time discrimination

6. Rhythm discrimination

7. Pitch discrimination

8. Melodic taste

9. Pitch imagery

10. Rhythm imagery


[...] the Seashore Measures of Musical Talents; the last three are new.

These tests are claimed to be "indicative of musical talent and achievement." In the manual (1930) norms are furnished but no [...]

1 See Predicting Success in the Study of Music, Veterans Administration Technical Bulletin TB 7-77, Dec. 21, 1947, in which [...] combinations.
2 Farnsworth, Paul R., in a review in The Third Mental Measurements Yearbook, p. 177. New Brunswick, N.J.: Rutgers University Press, 1949.


are presented on validity or reliability. The test is easily administered and scored. It has been rather widely used by music educators.

A new test of music has recently (1950) appeared, Musical Aptitude Test, by Harvey S. Whistler and Louis P. Thorpe.¹ The constructors of this test discard the analytic approach of Seashore and declare that "rhythm, pitch, and melody are the basic elements of all music."² The test is divided into five parts for administration: (1) rhythm recognition,


Fig. 18. Musical Aptitude Test: pitch recognition, pitch discrimination, and profile. (Whistler and Thorpe, 1950.)

(2) pitch recognition, [...] (3) melody recognition, 25 [...]; (4) pitch discrimination, [...] and (5) advanced rhythm recognition [...]. All tests are played upon the piano and are responded to [...]. The time needed for [...] the test is [...]. [...] of melody [...] samples the subject responds S (same) or D (different) [...]. In pitch recognition the subject responds [...] 4, according to the number of times a tone occurs in the [...]. (Consider the two samples from the test of pitch recognition in Fig. 18.) In pitch discrimination the subject responds to the two [...] presented with a two-count rest [...] with S (same), H (high),

1 California Test Bureau, Los Angeles, Calif. Items by permission.
2 Quotations from the manual.


10 ite 




or L (low). From the scores, a profile may be drawn, as shown in Fig. 18, on the furnished chart.

There are reported in the manual the results of the studies performed in standardizing the test. The validity was studied by correlating total test scores with teachers' estimates of instrumental talent (r = .37) and of vocal talent (.56), and with whether subjects had played on an instrument for 1 year (r = .56), in an orchestra or band (r = [...]), or had sung in a chorus, choir, or glee club (r = .19). The last three correlations mean that there was a tendency (as indicated by the size of the coefficient) for subjects who had played an instrument for 1 year or more to make high scores, for example, and for those who had not played an instrument that long to make low scores. The reliability for the total score is reported as .93 and for the three divisions as .80 to .88. Percentile norms have been calculated


corrected so that the average I.Q. 
100 and the standard devia 
for grades 4-8 inclusive. 
when the chronological 

Here, then, is a new 


= 51С 
g their study of instrumental and vocal mu? ^ 

(2) to aid in the groupi 

(3) to advise with 


INFORMATION, APPRECIATION, AND ACHIEVEMENT

It must be remembered that a successful achievement test indicates the amount of progress achieved by a student toward a defined [...] and [...] the teaching of music were clearly [...] following summary of the eighth-grade [...] an idea of how well these objectives have been worked out. The educatio[...] individuals can be brought to sing songs [...] them in a group.

1 Manual.
2 Report of Educational Council of the Music Supervisors' National Conference. Washington, D.C.: National Education Association, 1921.

Attainable Objectives for Grade 8
1. Ability to sing well and with enjoyment 30 to 50 (a) unison, (b) two-part, and (c) three-part songs. This group includes community [...] national songs. About 90 per cent of individuals sing alone at least [...] these songs.
2. Ability to sing at sight, using words, a unison song of hymn-tune grade; or, using syllables, a two-part song of hymn-tune grade and easiest three-part songs. About 30 per cent of pupils sing these songs individually.
3. Ability to appreciate the charm of design of songs sung; to give the [...] features of structure in a standard composition; to identify a [...] song after hearing it a few times and to know the titles and [...] of 20 standard compositions.
4. Knowledge of essential facts of elementary theory so that 75 per cent of [...] can give correct explanation of notational features in pieces [...] difficulty.

We may say more briefly that the attainable outcomes of instruction in music may be thought of as (1) singing well moderate amounts [...], (2) singing at sight, (3) appreciating [...] of songs, and (4) acquiring enough knowledge of theory to give correct explanation of notational features. There have been attempts to measure outcomes in each of these areas. For measuring the ability to sing well there is the Mosher Test of Individual Singing.¹ In this test 12 exercises arranged in order of difficulty are to be sung by the subject and scored [...] according to definite instructions. There is also the Hillbrand Sight-Singing Test.² This test for grades 4 to 6 contains six songs in a [...] folder. The pupil studies the songs [...] them [...] accompaniment. [...] kinds of errors made are to be recorded on a copy of the song. The errors are:

1. Notes wrongly pitched
2. Transpositions
3. Notes flatted
4. Notes sharped
5. Notes omitted
6. Errors in time
7. Extra notes
8. Repetitions
9. Hesitations

1 Mosher, Raymond M., A Study of the Group Method of Measurement of Sight-singing [...]. New York: Bureau of Publications, Teachers College, Columbia University [...]
2 Hillbrand, E. K., Hillbrand Sight-Singing Test. Yonkers, N.Y.: World Book [...]
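
The recording of errors by kind can be illustrated with a short tally. This is only a sketch in Python: the nine kinds are those listed above, but the function name and the idea of a list of marks are assumptions made here for illustration, and nothing in it reproduces the published scoring directions of the test.

```python
from collections import Counter

# The nine kinds of error listed above.
ERROR_KINDS = (
    "notes wrongly pitched",
    "transpositions",
    "notes flatted",
    "notes sharped",
    "notes omitted",
    "errors in time",
    "extra notes",
    "repetitions",
    "hesitations",
)


def tally_errors(marks):
    """Count the examiner's marks by kind, keeping the order of the list above."""
    unknown = set(marks) - set(ERROR_KINDS)
    if unknown:
        raise ValueError(f"unrecognized error kinds: {sorted(unknown)}")
    counts = Counter(marks)
    # Kinds that were never marked are reported with a count of zero.
    return {kind: counts.get(kind, 0) for kind in ERROR_KINDS}


# Hypothetical marks for one pupil on one song.
marks = ["notes flatted", "hesitations", "notes flatted", "errors in time"]
summary = tally_errors(marks)
print(summary["notes flatted"], sum(summary.values()))
```

Keeping the kinds separate, rather than reporting only a total, preserves the diagnostic value of the record: a pupil who flats notes needs different help from one who hesitates.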


Probably the most complete test for the knowledge of school music is the Kwalwasser-Ruch Test of Musical Accomplishment.¹ It is intended for grades 4 to 12. In its construction, attempt was made to use items from representative courses of study. There are 10 divisions of the test:

1. Knowledge of musical symbols and terms. To answer the items requires a knowledge of the tones of the scale, flats, sharps, clefs, rests, crescendo, diminuendo, lento, and legato. For example,

19. Allegro means    lively    slow    repeat    accent    sweetly


[...] under the other notes."

10. [...] key signatures. Must write the names of each of [...] signatures.

[...] the informational and factual sides of [...] Music Supervisors' National Conference [...] of facts is related to intelligence almost as [...]. A person who scores high on this test would not necessarily stand high in musical accomplishment.


TESTS OF INFORMATION AND APPRECIATION

[...] of interest in music can be had from one of the divisions [...] Preference Record. It is also somewhat indicated by an acquaintance with the authors of great music as well as with the music itself. For this reason Kwalwasser's Test of Music Information and [...] is interesting. This test is divided into three major divisions: (1) history and biography, (2) instrumentation, and (3) musical [...]

Under History and Biography there are tests involving the classification of such artists as Galli-Curci, Louis Graveure, Albert Spalding, [...] and John Powell under (1) vocalists, (2) pianists, (3) violinists, (4) cellists, and (5) conductors. Another test inquires about the nationality of composers, while another asks who were the composers of famous compositions. The final test in this division consists of 50 true-false items based on the general knowledge of composers and compositions. Illustrations are:¹

[...] expanded the range of pianism
16. [...] became deaf during the last years of his life
23. The metronome is associated with the name of Maelzel
38. [...] wrote exclusively for the voice
41. The symphonic poem was originated by Bach

Division II, on Instrumentation, asks whether tones on 10 orchestral instruments are produced by (1) blowing, (2) striking with hammers, or (3) [...]ing, e.g., oboe, viola, bassoon, mellophone. The subject is also asked to classify 10 orchestral instruments into (1) string section, (2) wood-wind section, (3) brass-wind section, and (4) percussion section. Such instruments as violin, xylophone, bassoon, celesta, and ophicleide are mentioned. There are also 50 true-false items which test information. Illustrations are:

4. The viola is an alto horn
14. The [...] has a double reed
24. The [...] employs a single reed
34. The euphonium has two "bells" or "flares"
44. Mutes are used only with stringed instruments
[...] The bass-viol is usually employed in string quartets

The third section contains 50 true-false items on musical form. Examples are:

1 Items by permission of Bureau of Educational Research and Service, University of Iowa.


4. An overture is played at the end of the opera 
14. Arias are found in symphonies 


24. Arpeggio means a gradual increase in loudness 
34. The cantata is a choral work with solos eliminated
44. The concerto is built on the rondo form 


There are two criticisms of the use of such a test for the measurement of appreciation. The first, a minor one, finds in the test, constructed several years ago, a lack of


classification of Galli-Curci 
whether a test of general inf 
of appreciation. Informatio 
will be shown in Army Al 
Bellevue. There is no doubt th: 


ART

The enjoyment of beauty is as old as civilization itself. For many years art was thought of as [...] beauty of form and color can [...]

1 See Whitford, W. G., An Introduction to Art Education. New York: Appleton-Century-Crofts, Inc., 1929.
1. To [...] the knowledge of the principles of art and of their [...] to everyday experiences: (a) in fine arts, to attain the [...] in the construction of great pictures [...]; (b) in applied arts, to learn the principles [...]

2. [...] in the beautiful wherever found: (a) [...] sky, ocean, trees, buildings, clothes, birds in flight, painting, [...] of all kinds; (b) in the various attempts to add beauty to a community through community centers, [...]; (c) in various attempts to beautify both the interiors and exteriors of homes.

3. To get some experience in and capacity [...] beauty: (a) in selecting and grouping fine objects [...] and in securing some [...] the process, and (b) in acquiring some [...] objects which conform to art and [...] coordination of eye, hand, and idea.

4. To develop keener capacities for observation so as to discover beauty in nature. Knowledge of what to look for and how to judge beauty or its absence in the objects which surround us. The teacher must stimulate whatever capacity the child possesses in the way of [...], initiative, and imagination in dealing with objects around him.

It will be seen that hardly any test covers adequately more than a fraction of these objectives.


MEASUREMENT IN ART

Measurement in art, as in music, has two aspects: (1) the measurement of capacity, and (2) the measurement of achievement.

The Measurement of Capacity

In attempting to [...] of subjects for the learning of art, test constructors have tried to [...] product into a few [...] mental processes which, if they are done well, indicate probable success in this undertaking. Three measures [...] described here: (1) the Meier-Seashore Art Judgment Test (125 pairs of pictures) and its revision, the Meier Art Judgment Test ([...] pictures with [...] scoring), (2) the McAdory Art Test, and (3) [...] Tests in Fundamental Abilities of Visual Art.

The Meier-Seashore Art Judgment Test and its revisions by Meier grew out of six years of experimentation [...] revision. The 125 pairs of items are the survivals of some 600 drawings [...] critical [...] judgments by experts. The art forms of the pictures [...] test of time for they were adapted from the work of old masters, from contemporary artists, and from Japanese [...]. [...] the manual, all items (1) were from reputable works, (2) exemplified aesthetic principles, and (3) were suitable for testing [...]. In taking the test, the subject, with the [...] two pictures before him, indicates his preference by drawing a circle [...]


Fig. 19. [...] preference for drawings, Meier-Seashore Art Judgment Test. (By permission of Bureau of Educational Research and Service, University of Iowa, Iowa City.)

[...] reliability range[...] repeated. Its validity has [...]


[...] for Grades 7 and 8, 9 and 10, and 11 and 12. It is also asserted that those who score in the highest quarter (percentiles 76 to 100) are almost certain of success; those in percentiles 51 to 75 will profit from instruction and have a chance at an art career; those in percentiles 26 to 50 will be able to do the manual part of drawing; and those below the 25th percentile should retake the test. These last will probably not succeed in art.¹
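
The four bands just described amount to a simple lookup from percentile rank to the interpretation claimed for it. The sketch below, in Python, is offered only as an illustration: the function name is invented here, and placing the 25th percentile itself in the lowest band is an assumption, since the passage speaks of "26 to 50" and of "below the 25th" without saying where 25 falls.

```python
def interpret_percentile(percentile):
    """Return the interpretation asserted for a percentile rank on the test.

    Assumption: a rank of exactly 25 is placed in the lowest band.
    """
    if not 0 <= percentile <= 100:
        raise ValueError("percentile must lie between 0 and 100")
    if percentile >= 76:
        return "almost certain of success"
    if percentile >= 51:
        return "will profit from instruction; has a chance at an art career"
    if percentile >= 26:
        return "able to do the manual part of drawing"
    return "should retake the test; will probably not succeed in art"


for rank in (90, 60, 40, 10):
    print(rank, "->", interpret_percentile(rank))
```

Such a table states what is asserted for each band and nothing more; whether the predictions hold is a question of validity that the lookup itself cannot answer.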
The authors believe that this test measures aesthetic judgment, the most essential characteristic of artistic production. "Aesthetic judgment [...] as the capacity for perceiving quality in aesthetic situations [...] apart from formal training." The items are rather permanent in [...] so that time does not affect them greatly. One critic believes the test measures the perception of quality rather than its production. [...] "a useful measure of individual sensitivity to aesthetic organization of graphic form." [...] wonder if a test constructed [...] entirely of the graphic arts can apply to the whole field of art and whether there are not other factors in artistic competence that are just as important.
The McAdory Art Test is another instrument for measuring art judgment. It differs from the Meier-Seashore Art Judgment Test in several particulars. In the first place the materials out of which the test is constructed are of a practical nature, made up of [...] clothing, architecture, furniture and utensils, as [...] light masses, paintings, and shape and line arrangements. On each plate there are four samples, A, B, C, and D, which the subject is to arrange in the most pleasing order. He receives one point for each sample which he judges to be in the position voted by expert judges. The whole test has been restudied and the samples judged again [...] experts. In this revision [...] plates were eliminated and the positions were changed in four others. All told there are 72 plates, 24 of [...] are in color. By means of record sheets the judgments are registered [...] as many as 30 students at one sitting.

The reliability of the test ranges from .79 to .93 depending on the [...] which is used. Its validity was studied by relating its scores to other art tests. For example, its correlation with the Christensen

1 See Examiner's Manual, pp. 8-9 [...]
[...] op. cit., Item 1327.
Siceloff, Margaret McAdory, The McAdory Art Test. New York: Bureau of Publications, Teachers College, Columbia University, 1933; and McAdory, Margaret, The Construction and Validation of an Art Test. New York: Bureau of Publications, Teachers College, Columbia University, 1929.


Fig. 20. Plates 8 and 19 of the McAdory Art Test. (By permission of Margaret McAdory Siceloff.)


Art Test was .63, with the Meier-Seashore [...] Test, .27, and with the Levering Art Judgment Test, .58. The author [...] the low correlation with the Meier-Seashore test by saying [...] is dependent upon the particular objects judged. [...] knows, few, if any, correlations have been computed [...]. Norms have been established [...] 5,000 or 6,000 students in the New York area and [...] from grade 3 to college and art schools. As with other art tests, its correlation with intelligence is low (.15).
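
The scoring rule stated for the McAdory plates, one point for each sample placed in the position voted by the expert judges, can be written out briefly. The following is a sketch in Python under stated assumptions: the function names and the use of four-letter strings such as "BACD" to record an arrangement are invented here for illustration, and the expert key shown is hypothetical, not taken from the test.

```python
def score_plate(subject_order, expert_order):
    """Count the samples the subject placed in the experts' position."""
    if sorted(subject_order) != sorted(expert_order):
        raise ValueError("both orders must arrange the same samples")
    return sum(1 for s, e in zip(subject_order, expert_order) if s == e)


def score_test(subject_orders, expert_orders):
    """Sum the plate scores over all plates of the test."""
    if len(subject_orders) != len(expert_orders):
        raise ValueError("one expert order is needed for every plate")
    return sum(score_plate(s, e) for s, e in zip(subject_orders, expert_orders))


# Hypothetical expert key and one subject's arrangements for three plates.
expert_key = ["ABCD", "CADB", "DBAC"]
subject = ["ABCD", "CABD", "ABCD"]
print(score_test(subject, expert_key))
```

On this reading a plate is worth from 0 to 4 points, so that 72 plates would allow a maximum of 288; that ceiling is an inference from the rule as stated here, not a figure given in the text.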
According to the author, the uses of the tests are [...]. As an educational device it distinguishes those with artistic ability from others who do not possess it. It can thus select pupils for art classes as well as help the teacher decide whether art work should be continued. It may be used when advising with students concerning their prospective occupations which require ability in art. In the third place, it may have consumer use in helping the ordinary individual to know [...] dependence to put on his own judgment in selecting art objects for daily use.

The value of this test is lessened because styles are [...] changing in the practical materials of which its plates are composed. [...] the advantage of being a group [...] teachers' ratings of art [...] other art and intelligence tests. In general, then, [...]

[...] in Fundamental Abilities of Visual Art,¹ grades 3-12, are divided into three parts, as shown in the accompanying table:


Part Time, minute? 
ar 
П 10 
1. Recognition of К жаша ЛИГЕ 20 
2. Originality of line o7 MM ал 
II 


- Observation of light and shade 
- Knowledge of Subject-matter у 


3 

4 

5. Visual memory of proportion 
ш 

6 

1 

8 

9 


ocabulary. . . 


20 


[...] from the same object represented in four different proportions that one which is the best. There are cups, friezes, curves, masses, etc., each with four proportions from which the subject [...] select the best. Test 2 consists of 10 sets of dots arranged [...] manner through which the subject is to draw interesting things. This is a fine test of originality in imagination. In Test 3 (Fig. 21) the subject marks with an X those areas where there should be shade. Such objects as cubes, spheres, cylinder and cup, and a house are the pictures. The directions, to be read aloud by the examiner and silently [...] pupils, are as follows:

This is to show how well you understand and interpret problems in light and shade. In the ten drawings below mark with a (X) each place or surface where you [...] should be a shade or a shadow. The light is coming from the left. Only [...] objects in No. 6 and No. 7 are open.

Fig. 21. Lewerenz Tests in Fundamental Abilities of Visual Arts. Observation of light and shade.

Test 4 is a test of the vocabulary of materials, processes, drawing and of the [...] of pictures. Test 5 is a picture of a large vase which after being seen must have its outline drawn. Tests [...] and 8 have to do [...] different types of perspective [...] recognition. On the top of the color chart [...] yellow, [...] blue, and violet. [...] chart formed, writes down the [...]

The reliability of the test is indicated by a coefficient of .87 computed [...] pupils in grades [...] only a fair figure for reliability. Its

1 Lewerenz, Alfred S., Lewerenz Tests in Fundamental Abilities of Visual Art. California Test Bureau, Los Angeles, Calif. Items by permission.

ly 4 


validity was studied by correlating its scores with semester grades [...]; this figure was .40 (manual). In another study, as [...] in the manual, the coefficient [...] estimate was .63. Norms [...]

[...] will depend on whether the approach to art should come through the [...] skills or [...] study of integrated [...] or painting. Another [...] arises as to whether measures of [...] Reviewers agree [...] philosophy of the teacher who [...]


asuremens of A chievement 
Only one test will be us 
e 


t 
jevem ^ 
ed to illustrate the measu rement of "and асве, 
Th er Art Ability т, St,’ measures both capacity an nsiste 1 
ment. Wh many of the tests of art t us far described have co oy 
Judging or, Most, finishj g 
of actual i 


r£ 
ists 18 [be 
i а drawing the present test M nside К. 
Problems or dnd ART from ав кши. drawing f ed 
S of the test After the first test, which consists of dr conce!” 
© design, the major problems are 


€ 

figu e 

5. The Subject must nt o. ke 

Сир in а Saucer: 4. compost "RN 

trees, а Cottage d ath; and dee "DES we din Dog- motio” 

The aes a raded both fo Position and for expression of € by 

e mr furnis Cales at three levels of quality—10, 6, een reli 

e Eo Which the Tàwings can be more accurately rated. ho a 

bility of the test is reported ag SSiN tha eae of 83 subjects W. aver га 

greatly in ability. The test? alidity h b tudied. The T 
Score on the test fo $ аз been ж 


Е rart teac 
median for art ma; 


61. rt 
S ET 

ors in the «tS Was 12 › for поп-агі га non? 
H 1 the Juni T class in colle е was 95, or 
i 52.? Norms compute em ба of grades. У і 

i ms ith do. ^ c n Tough the twelfth, medians are able- 
along wi grees of ability as shown ii the абббитралуше t ub 

! Knauber, Alma Jordan, Th 

lished by the author. © Knaubey Ant 


innati, Ohi?’ 
Ability Test, Cincinnati 
? From the manua], 




Grade     Very low    Low        Average    ...        Exceptional
norm      ability     ability    ability    ability    art ability

...       0-10        11-15      16-28      29-39      40-170
12        0-18        29-42      43-69      70-89      90-170


Similar norms are furnished at the college level. The author claims that the test measures largely native ability. On the surface it is a measure of competency gained in taking courses in art. The scores undoubtedly reflect partly native ability, partly interest, and partly the adequacy of training received.


MANUAL ARTS 

More than 40 per cent of the citizens of the United States who are gainfully employed are working directly or indirectly in activities that demand some facility with or knowledge of machines. Thirty per cent of our population are skilled workmen, many of whom need to understand mechanical processes and to manipulate parts of machines. Add to these skilled artisans a goodly number of machine tenders who are semiskilled. Furthermore, there are a rich variety of occupations in this area varying in complexity all the way from changing spools on a machine to building a cabinet. For these reasons, the measurement and prediction of mechanical ability or aptitude is of the greatest importance. In the third place, there is an inclination in some quarters to direct students who fail in academic subjects into the courses in manual arts without regard to their mechanical aptitude or ability. While there is a small correlation between academic aptitude and mechanical aptitude, there is ample evidence to show that those low in academic aptitude are not necessarily high in mechanical aptitude. Just as in other courses, individuals who enter courses in manual arts should have aptitude for them.

Tests of mechanical ability are used both to measure school achievement and to indicate the presence of mechanical aptitude. More than is the case with tests in other fields, prediction is an important function of the test. These instruments of prediction foretell the probable success of a student not only in the manual arts but also in the occupation which he is most likely to enter.

The school's function is to acquaint the students with the breadth and significance of this area which fills such a large place in our civilization. This is possible through trips, reading, and descriptions on the one hand and through participation in some actual occupation on the other.




Well-planned courses in industrial and practical arts strive to impart this ...

Courses in manual arts in the elementary school are apt to be rather general in nature, with less emphasis on precision in ... and more upon a general understanding of the part that ... and industrial arts play in our civilization. The materials of the course frequently grow out of the problems being faced daily by the ... of that community. Their main purpose is exploratory in that the child explores his interests, aptitudes, and general fitness for occupations which require the coordination of mind and hand. Such a one who makes a table or a lampstand appreciates more keenly the work required to construct an acceptable commercial object and consequently is more apt to acquire a new respect for labor and the laboring man. These courses in the manual arts, then, are characterized by a considerable variety because they vary with the environment in which the school is placed.

In the junior high school there is also a wide differentiation among courses. Boys' aptitudes are provided for in such courses as manual training, plumbing, electricity, woodworking, metalworking, cabinetmaking, etc., while those of girls are met in domestic science, household arts, prenursing, bookcraft, or home decoration. These courses require more exactness in the objects constructed and more workmanlike form in the processes used. Because there is such a rich variety of courses, ... tests have been widely used. Tests of information ... construct, but standardized tests or scales for use in ... the objects made in these courses are few ...

OBJECTIVES IN THE TEACHING OF MANUAL ARTS

As we have often said, ... be clearly defined before ...



3. Finally, to (a) furnish an opportunity to develop special aptitudes ..., (b) ... a need for further courses by pointing out the part that ... play in successful industrial work, and (c) offer ... such as printing to those who are soon leaving school to ...

Many of these outcomes of instruction in the manual arts have not as yet been measured. Easiest of all to measure is the amount of information ... Interest in mechanical activities is well reflected in the ... interest inventories as developed in Chap. 16. Measures of aptitude also will be described in the course of the present chapter.


TESTS

Nearly all the tests in fine arts contribute something to the measurement of industrial arts. Especially is this true of the McAdory Art Test and the Tests in the Fundamental Abilities of Visual Arts. Among the tests of woodworking and mechanical drawing, only the Nash-Van Duzee Industrial Arts Tests will be described. The Nash-Van Duzee Industrial Arts Tests¹ are divided into two tests:

Test I. Woodwork
    Scale A. Technical and related information
    Scale B. Performance
Test II. Mechanical drawing
    Part I. Information
    Part II. Performance

Scale A of Test I is composed of true-false items. Multiple-choice ... with ... processes and methods used in woodwork, the care and use of basic hand and machine tools, etc. Test I also uses diagrams to test knowledge and understanding of common joints used in woodwork, and incomplete drawings of a simple wood block to test understanding of ... views ... in shop drawings. In Scale B the subject is given a piece of wood and the proper tools with which ... processes are to be performed. The subject has, for example, to plane "plane, square, and true" (1) a ..., (2) an edge, (3) an end. ... finished straight and true ... mortise smooth ... A booklet is furnished which aids the tester in scoring ...

Test II, Mechanical Drawing, ... processes which ... schools ... used. As in the ... test, Part I consists of completion and multiple-choice tests of information relative to mechanical drawing. It also has a test of interpretation of the conventions of drawings and machine drawings. Part II contains tests of dimensioning, geometrical constructions, making a working drawing, lettering, and orthographic drawing (Fig. 22).

The reliability of the tests varies from .61 to .94 with a median of .87 for the test as a whole. Its norms are unique indeed, for instead of the usual median or percentile for each grade, the norms of ... best score are given for the number of minutes the course has ...

1 ... Publishing Company, Milwaukee. Item by permission. Test I, Woodwork, Scale A.


[Figure item: "2. Draw an auxiliary view of the surface A-B. The plane cuts the pyramid at an angle of 45 degrees to H."]

Fig. 22. Section D, Orthographic Drawing. Nash-Van Duzee Industrial Arts Tests.




A study was carried on to determine a list of the practical jobs of mechanical nature ordinarily done around the home. First 382 jobs of this nature were reduced to 130 which were ... Then a description of these jobs was sent as a questionnaire to ...100 mature people in homes in the Middle West,¹ who were asked to check the jobs ... perform. The investigation also sent a questionnaire to a number of schools to discover what was being taught in their ... home mechanics; 75 schools replied. Altogether 72 jobs were selected (1) because they were widely used in home and ... mechanics, and (2) because they stood high in social utility. The test went through an experimental and a final edition. Two forms, with ... jobs in each, were constructed. A composite table of percentages of accomplishment shows the percentages of achievement in grades 7 to 9 in each of 10 schools. The reliability of the test is not emphasized, but it correlates .44 with the Otis Intelligence test, .26 with the Stenquist Assembly Test, and .64 with teachers' ... course in ... home mechanics.¹ The two following examples from the test illustrate the type of activities which compose the test and the manner in which measurement was made. The directions state that all figures are to be rearranged in the right order according to their numbers.


To ... a joint with tinner's rivets
    ... the rivets
    ... the rivets
    ... the rivets

To assemble a radio set
    ... set according to circuit diagram
    ... the necessary parts and supplies
    Decide on circuit
    ... on panel
    ... panel and fasten to baseboard
    Lay out panel and baseboard


This test is introduced here more as a sample of procedure than as a standardized test. In the first place, the samples of mechanical ... in the Middle West might not be the same as those in the ... the Far West. Nor could the norms be applied in other sections of the country without modification. On the other hand, the procedure in test construction which discovers what is actually being done in the home and then checks this outcome with the school procedure is sound.

1 Newkirk, Louis V., Validating and Testing Home Mechanics Content. Studies in Education, Vol. 6, No. 4. University of Iowa, 1930-1932. Items by permission.




Home Economics

The objectives of instruction in home economics ... divided into the immediate and the more remote. Immediate objectives:

1. To develop skill in the selection, preparation, and use of foods. This involves the acquisition of (a) the knowledge and understanding of the facts and principles of nutrition, as well as (b) application of these facts and principles to the actual preparation of food for the table.
2. To develop efficiency in exercising good judgment in the selection and making of clothing. This efficiency depends upon the foundation of information about the characteristics of different kinds of cloth, about the use of patterns in cutting out garments, and the ... effect upon the person of different kinds and colors of cloth arranged in a variety of ways in garments, etc.
3. To ... the ... which make for an efficiently run household. In this division there are problems of the proper management of time and money, of good social relations within the household, as well as of house planning, of house furnishing, and of house ...
4. To understand and to apply to the care of the home the best principles of aesthetics, hygiene, and sanitation.

The more remote objectives aim to develop within each individual those attitudes which will result in consideration for the comfort and convenience of others as well as in a willingness to serve for the common good of the whole family.


Measurement in Home Economics

Most easily measured is information about foods, clothing, and ... Tests covering several branches of home economics have been prepared at Purdue University.¹ The accompanying table lists the titles of four tests, all suitable for grades 7 and 8.

1 State High School Testing Service, Purdue University, Lafayette, Ind. Items by permission.




Test                                              Time, minutes
1. Assisting with Clothing Problems                     28
2. Helping with the Housekeeping                        28
3. Helping with Food in the Home                        28
4. Assisting with Care and Play of Children             28

These tests are based on the course of study of the state of Indiana ... common principles. These tests have no published norms ... but deserve mention because they cover each unit ... They furnish highly suggestive techniques for tests in home economics for grades 7 and 8.


Tests of Home Economics: High School

Measurement in the field of home economics at the high school level is divided into two parts:

1. Tests of information in the areas of food, clothing, and home ...
2. Rating scales (a) of habits and procedures used in preparing food, and (b) of the foods themselves.

In the tests of information and understanding we must turn again to the ... prepared by a group of workers for the state of Indiana. The accompanying table lists the tests.

Test                                              Time, minutes
Child Development                                       ...
Home Care of the Sick                                   ...
Housing the Family                                      ...
...


These tests are carefully constructed, cover the areas well, and test both information and understanding. They are not, however, standardized tests because they have no norms or computed reliabilities and are still in the experimental stage. A more detailed ... of ... these tests will give an idea of the soundness of the above statements.

The test titled ... Selection and Preparation, contains ... items ... (true or false), multiple choice, and ... Sometimes the items are couched in the form of a problem or situation, as for example, the presentation of a menu which is ... by checking whether it contains a variety of color, is a ... meal, contains little starch, etc. ... illustration of a ... problem:


Place in the blanks at the right of Column II the letter of the food group in Column I that best identifies the item in Column II or the function ... The first question is done correctly to show you how to proceed. Some items may be used twice and some not at all.

Food Groups: Column I        Items: Column II
(a) protein foods            85a. Meat, poultry, and fish                     a   85a
(b) Vitamin A                85. Codliver oil                                ___  85
(c) minerals                 86. Calcium, iodine                             ___  86
(d) Vitamin C                87. Sugars and starches                         ___  87
(e) carbohydrates            88. Butter, shortening, bacon fryings           ___  88
(f) vitamins                 89. Found in green and yellow vegetables        ___  89
(g) fats                     90. Bread, potatoes, cereals                    ___  90
(h) Vitamin D                91. Found in sunshine                           ___  91
(i) sugar                    Functions: Column II
(j) starchy foods            92. Prevents "night blindness"                  ___  92
(k) Vitamin B1               93. Repairs body tissues                        ___  93
                             94. Prevents rickets in children                ___  94
                             95. Provides energy quickest in the body        ___  95
                             96. Necessary for healthy teeth and gums        ___  96
                             97. Necessary for growth                        ___  97


Many other problems are included, such as what Mary ... do in preparation for the family breakfast, what foods should ... by high school students, why Edith's cake fell in the center, and what correct practices Barbara observed at a dinner party. In general, a list of answers is furnished and the student checks the correct ...

In the ratings of habits and of foods, the Minnesota Check List for Food Preparation and Serving by Clara M. Brown¹ consists of 13 rating scales.

                                                                Score
                   1         2         3         4         5

1. Grooming        (1) Untidy; hands or nails dirty; dress soiled or inappropriate, no apron; hair in disorder and unconfined.
                   (3) Reasonably well groomed; dress suitable; apron soiled or wrinkled; hair neat but not held in place.
                   (5) Immaculately clean; dress and apron fresh, unwrinkled, and appropriate; hair held in place by band or covering.
                                                                (1) ___

10. Setting        (1) Wrong dishes, silver, or table cover used or arranged incorrectly; table looks crowded.
    of table       (3) Dishes, silver, and table cover suitable and arranged correctly; centerpiece lacking or inappropriate.
                   (5) Dishes, silver, and table cover suitable and correctly arranged; decorations attractive.
                                                                (10) ___

1 University of Minnesota Press, Minneapolis. Items by permission.

Each of the 13 traits rated is accurately described at three levels of achievement. Two of the scales are shown in the preceding table. The ... checks each of the 13 scales at the point which describes the ... points. The value of using a check list is clearly stated in the ...


Value of Using a Check List

The following statements of the results of using a check list are ... the findings of experimental studies.

1. Learning proceeds more rapidly when goals are clearly defined than when the learner has only a vague idea regarding them. Use of the check list enables students to see clearly what desirable standards are.
2. Pencil-and-paper tests, no matter how high a degree of reliability ..., are not valid measures of a person's ability to do certain ... The correlation between knowledge as recorded in pencil-and-paper objective tests and the abilities listed in the check list appears ... considerably below .50. Since this is true, it is essential that ... of appearance, personal habits, and work abilities be ... these are regarded as important goals.
3. Providing descriptions of low and average achievement as well as of the high level increases accuracy of rating and enables students to understand wherein they fail to reach the standard.
4. Objective self-evaluation tends to accelerate the rate of learning; and the use of such devices as the Minnesota Check List permits individuals to judge their own achievements and limitations.


There are no norms, or published reliability, or any correlations of ... with other criteria.

... important rating instrument here described ... Score Cards, revised edition,² which was constructed ... of Clara M. Brown. These cards contain ... quality of 57 foods. The precise wording of the rating ... the objectivity of scoring. ... foods ... bacon, coffee, eggs ..., fruit cup, ... (four kinds), soufflé, tea, ... These cards are constructed to rate the success of students in actually preparing food in the laboratory. One example is shown in the table on page 316.

1 University of Minnesota Press, Minneapolis. By permission.
2 Cooperative Test Division, Educational Testing Service, Princeton, N.J. Item by permission.



Ice Cream

                   1                                   2      3                                      Score

Color          1.  Muddy or pale                              Clear and uniform                      ___
Consistency    2.  Too hard or runny                          Just firm enough to hold shape         ___
Texture        3.  Coarse, granular, or fluffy                Smooth, velvety, compact               ___
Flavor         4.  Flat, insipid, or too highly flavored      Delicate yet definite; well-blended    ___


MECHANICAL APTITUDE AND ABILITY 

Thus far we have considered achievement tests in the fields of fine arts, mechanical arts, and home economics. The rest of the chapter will be devoted to a consideration of the measurement of mechanical aptitude or ability.


USES OF TESTS OF MECHANICAL ABILITY

... test of mechanical ability, which consists of the control of the direction of a pointer by two screws which work ..., correlates .57 with ..., ... with machine operating, .59 with toolmaking, and .62 with turning (lathe work).¹

PROCEDURES USED IN TESTING

There are three procedures which may be used to test mechanical ability: (1) analyze the mechanical processes into elements and test them, (2) construct tests of information ...

1 Bingham, Walter Van Dyke, Aptitudes and Aptitude Testing, p. 135. New York: Harper & Brothers, 1937.




Analysis of Processes into Elements 


Just as the tests of musical ability may divide this ability into pitch, intensity, time, rhythm, etc., in like manner mechanical ability ... into (1) reaction time, (2) agility and strength, (3) manual dexterity, (4) steadiness, (5) manual rhythm, etc. To measure these abilities efficient measuring instruments have been constructed. Reaction time has been measured by a chronoscope in thousandths of a second.

Reaction time is the elapsed time between the giving of a signal and the ... of some defined act. Under the simplest conditions an individual sits with one hand on a telegraph key which he pushes down whenever a light is flashed. The signal may be a flash of light, a sound, a touch, a smell, pain, etc. The reaction time depends on the ... set of the individual, the intensity of the stimulus, and of course the type of individual. When such measures are made on the same individual we find large differences in the reaction time which depend upon the modality employed. For example, the reaction time for a touch on the hand averages about 0.120 second, while it takes 1.082 seconds to respond to a bitter taste.

Other simple measures of simple abilities are now considered. Agility among children has been measured by their capacity in jumping, ... balls, and climbing ladders. Manual dexterity has been measured by simple tapping in which an individual strikes a brass board as rapidly as possible with a stylus which is in circuit with a counter. The counter registers each tap. Steadiness is measured by a subject's moving a brass stylus between two converging brass plates until he touches one of them, or else by ... a stylus into holes graduated in size without touching the sides of the holes. Rhythm has been measured by having an individual listen to a ... of notes which is repeated several times. The test comes in keeping time with the sequence by pressing a telegraph key.

There is no question about the accuracy of these measurements. They do what they purport to do, but they do not correlate with the ordinary mechanical performances with which the school is concerned. These latter activities are much more complex and include these simpler functions in a great variety of combinations.


Tests of Information about Mechanical Ability

In these tests many types of information about mechanical devices and processes are sampled. The assumption is that those individuals who have good mechanical ability will be continually examining the machines which are around them, will read accounts of new machines in such magazines as Popular Mechanics, and thus will accumulate mechanical information. On the other hand, those possessing little mechanical ability will not examine machines nor will they care to read about them, and so they will not accumulate information on machines and their processes. Unfortunately for the use of this ..., such a test is substantially correlated with intelligence and hence does not afford a unique measure of mechanical ability. Here are a few examples from the Detroit Mechanical Aptitudes Examination for Girls:¹

14. Solder will stick best to   1 glass   2 lead   3 leather   4 ...
20. Glass is usually cut with a   1 chisel   2 file   3 scissors   4 ...
23. A spark plug is in the   1 commutator   2 cylinder head   3 manifold   4 piston.
27. A carburetor   1 explodes gas   2 measures gas   3 mixes air with gas   4 ...
... An electric doorbell requires   1 current   2 fuse   3 plug   4 switch.

In addition, practically all paper-and-pencil tests of mechanical ability have one or more sections which are dependent upon mechanical information for their correct answers.


Mechanical Assembly and Performance Tests

Mechanical assembly tests, as their name implies, consist of putting together in the correct manner parts of disassembled mechanical gadgets. Stenquist's original mechanical assembly test was made up of such objects as a bicycle bell, a chain with split links, a small door ..., and a mousetrap. The disassembled parts were to be reassembled by aid of a screwdriver. An assembly test was also constructed by Toops which contained items lying more nearly in the usual environment of girls. Such problems as the stringing of beads, cross-stitching, tape ..., card wrapping, and making a trunk tag were used. All these assembly tests demanded a great variety of psychological processes ... steadiness, and manipulation. Since they were more like real-life situations ... aptitude. Many of the tests later to be described ... Revised Minnesota Paper Form Board ... Aptitude Test for Men and Women, (3) Aptitude Tests for Occupations, and (4) the Differential Aptitude Tests.

1 Item by permission of Public School Publishing Company, Bloomington, ...


Performance Tests

Of this group, the Minnesota Mechanical Assembly Test is of the first importance. The builders of this test first made a thorough canvass of ... tests.¹ Among the many tests investigated, the Stenquist Mechanical Assembly Test proved satisfactory save in one important particular: it had a low reliability. This rather short test was lengthened. ... items were tried out and the successful ones embodied in the test until there were three boxes (A, B, and C), each of which contained 11 ... to be reassembled (Fig. 23). You will note that this test contains such gadgets as a large paper clip, an ordinary lock, a safety razor, a pair of pliers, scissors, a bicycle bell, a die holder, an expansion nut, and ... other mechanical objects to be put together.
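Why lengthening a short test raises its reliability can be sketched with the Spearman-Brown formula, the standard relation between test length and reliability. The text does not say that the builders of the Minnesota test used this formula, and the starting reliability of .60 below is a hypothetical figure chosen only for illustration.

```python
def spearman_brown(reliability: float, length_factor: float) -> float:
    """Projected reliability when a test is made `length_factor` times as long.

    Spearman-Brown formula: r_k = k * r / (1 + (k - 1) * r).
    Assumes the added items are comparable to the original ones.
    """
    if not 0.0 <= reliability < 1.0:
        raise ValueError("reliability must be in [0, 1)")
    if length_factor <= 0:
        raise ValueError("length_factor must be positive")
    return length_factor * reliability / (1.0 + (length_factor - 1.0) * reliability)


# Hypothetical short-form reliability; NOT a figure reported in the text.
short_form_reliability = 0.60

for k in (1, 2, 3):
    projected = spearman_brown(short_form_reliability, k)
    print(f"test {k}x as long: projected reliability {projected:.2f}")
```

Under these assumed figures, doubling the test lifts the projected reliability from .60 to .75, which is the kind of gain that motivates adding more objects to an assembly test.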

As in other tests, the most difficult problem of all was the establishment of the test's validity. In so many cases the criterion against which to measure the validity of a test is no more sound than the test itself. In the present instance a criterion was desired which had in it the essence of mechanical ability. The criterion finally selected was the quality of the mechanical work actually produced in a junior high school ... mechanical arts. Every effort was made to measure accurately ... of the class's workmanship. In the first place direct observation and inspection were made as to whether, for example, ... arranged in printing, whether the working lines showed in ... drawing, whether there were loose wires in electrical wiring, ... chipped ... woodworking. In the second place, actual ... applied whenever possible. Rulers were used to ..., calipers to measure dimensions in mechanical drawing, ... rounded corners, steel square to locate rivets, and a graduated ... to measure the flatness of boards. In the third place ... constructed with graduated samples of increasing fineness of quality. There was thus one scale for rating the soldering of biscuit ..., another for judging the splices of wire in electricity, and another for judging lettering.

The results of these three criteria were combined into a quality criterion which was reliable and against which all tests could be measured. ... stood out above the others in their correlation with this criterion. The Minnesota Mechanical Assembly Test correlated ...; the Minnesota Spatial Relations Test, .53; and the Minnesota Paper Form Board, .52. ... the results of an information test of mechanical ... criterion of quality only the ... substantially (.52 to .65). It was ... was correlated with intelligence ... added ... significance to the measurement of mechanical ability.

1 Paterson, Donald G., et al., Minnesota Mechanical Ability Tests. Minneapolis: University of Minnesota Press, 19...

Fig. 23. Materials from Minnesota Mechanical Assembly Test, short form, Boxes I and II. (By permission of the Marietta Apparatus Company, Marietta, Ohio, and Professor Donald G. Paterson.)


... the battery. Three of them, tracing, tapping, and dotting, have a large manual-dexterity element. Tracing consists of drawing lines through small openings placed in a series of vertical lines about 1/4 inch apart. Tapping consists simply of putting three pencil dots as fast as possible in a series of circles, all of equal size and ... In the dotting test the subject places one dot in each circle of a ... line of circles occurring at irregular intervals. The second group of tests, consisting of copying, location, blocks, and pursuit, is more closely related to intelligence. The copying test consists of tracing out on dots arranged in rectangular order a simple figure. The point of beginning is indicated with a circle around the proper dot. The test of location consists of recognizing on a smaller area the ... letters placed in a much larger area. In the blocks test a set of blocks drawn all the same size are piled up in a variety of ways. The problem is to count by direct visual inspection and visual projection the number of blocks which touch a marked block. In the pursuit test the subject must follow a single wandering line through a maze of other lines to its correct destination (Fig. 24).

Fig. 24. ... (By permission of ... University of Southern California.)

Norms have been computed for ages 10 to 16 and for adults. These latter norms are based upon 1,000 males and 1,000 females, aged 16 and ... The reliability and validity of the test are discussed in the succeeding ...

What characteristics have led to such wide application? The characteristics of brevity, ease of administration, separateness of the subtests, ... The test can be administered ... and its norms are ... for the subtests taken separately and for the battery as a whole. The reported ... shown in the table at the top of page 322.

Thus not only is it possible to correlate scores on the test as a whole with success in any occupation, but any single test's score, or any combination of scores weighted in any manner, may be likewise correlated.


What 
a tistics characteristics have led to § 
nd of brevity, ease of admi 




Test                Reliability

1. Tracing              .80
2. Tapping              .78
3. Dotting              .74
4. Copying              .86
5. Location             ...
6. Blocks               .80
7. Pursuit              .76

Total score             .90


One study (Harrell and Faubion, 1940),¹ for example, concluded that the tracing subtest predicted more accurately the elements of ... than the test as a whole. It is thus possible to use optimum weights for each prediction.

At Hunter College, it was demonstrated that a combination of pursuit, tracing, and dotting predicted success in typing. Lawshe points out that a multiple R with optimum weighting correlated .46 in selecting radio-assembly operators, while the total test's correlation was .42. Its correlations with success records in occupations have been keys both to its use and to its validity. It has been correlated with such mechanical occupations as aviation mechanics, aircraft inspectors, machine ..., tool-maker apprentices, gun wrapping, and mechanical drawing; ... the correlations with these criterion scores have rarely been as ... they have shown their worth in combination with other predictive factors.
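The idea of a multiple R with optimum weighting can be sketched for the simplest case of two predictors. The formulas are the standard ones for standardized regression weights and the multiple correlation; the three correlations used are hypothetical values chosen for illustration, not figures from Lawshe or from the test manual.

```python
import math


def two_predictor_multiple_r(r1y: float, r2y: float, r12: float):
    """Return (beta1, beta2, R) for predicting a criterion y from two subtests.

    r1y, r2y: correlation of each subtest with the criterion.
    r12:      correlation between the two subtests.
    beta1, beta2 are the optimum (least-squares) weights for standardized scores.
    """
    denom = 1.0 - r12 ** 2
    if denom <= 0:
        raise ValueError("subtests must not be perfectly correlated")
    beta1 = (r1y - r2y * r12) / denom
    beta2 = (r2y - r1y * r12) / denom
    r_squared = beta1 * r1y + beta2 * r2y  # equals (r1y^2 + r2y^2 - 2*r1y*r2y*r12) / denom
    return beta1, beta2, math.sqrt(r_squared)


# Hypothetical validities and intercorrelation (assumptions for illustration only).
beta1, beta2, multiple_r = two_predictor_multiple_r(r1y=0.40, r2y=0.35, r12=0.30)
print(f"weights: {beta1:.3f}, {beta2:.3f}; multiple R = {multiple_r:.3f}")
```

With optimum weights the multiple R can never fall below the better single predictor's correlation, which is why a weighted combination of subtests may edge out an unweighted total score.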


ing as well as in projects of cons 

Showed a significant difference е емер 
promising and most unpromising." Me elec 
pils aged 12 to 15 developed a project E jish 
rrelation between test scores and accom? 


MacQuarrie’s detiene in time to complete the project; -12 


aptitud. d oordin?" est 
speed of finger movement, a; E ecd es sua tant а s ТІ Us 


nd ability to visualize space-"" 
1 Harrell, Willard, and Richard Faubion, "... Mechanics," Journal ...
5 Stoy, E. G., "Additional Tests for Mechanical Drawing Aptitude," Personnel Journal (1928) 6:361-366.
4 Horning, S. D., and Ruth S. Leonard, "Testing Mechanical Ability by the MacQuarrie Test," Industrial Arts Magazine (1926) 15:348-350.


MEASUREMENT OF FINE ARTS AND MANUAL ARTS 323 


itself ... and of a set of tests to measure specific aptitudes. It undoubtedly ... manipulative skills which involve the dexterity of finger ..., acuity of vision, the control of muscles, and the perception ... There is little in the test concerned with the understanding of the ... principles of mechanics or with familiarity with the ... Like other tests which predict, this test is plagued with limitations. How can real prediction be much better than chance when the ... instrument correlates .45 with the criterion of ...? Remember, the efficiency of a correlation coefficient of .45 is just 11 per cent better than chance. Unless many other factors are used in the prediction, a counselor will go wrong much more often than he will go right when he uses such an instrument.
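
The figure of 11 per cent quoted for a correlation of .45 can be checked with a short calculation. The sketch below assumes that "efficiency" here means the index of forecasting efficiency, E = 1 - sqrt(1 - r^2), which was the usual measure in texts of this period; the passage does not name the formula, and the function name is only illustrative. Under that assumption a correlation of .45 yields roughly 10.7 per cent, which rounds to the 11 per cent given above.

```python
import math


def forecasting_efficiency(r: float) -> float:
    """Index of forecasting efficiency, E = 1 - sqrt(1 - r**2).

    Assumed (not stated in the text) to be the measure behind the
    "per cent better than chance" figure.
    """
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient must lie between -1 and 1")
    return 1.0 - math.sqrt(1.0 - r * r)


if __name__ == "__main__":
    # Correlations mentioned in this chapter, plus the .60 of the exercises.
    for r in (0.25, 0.30, 0.45, 0.60):
        gain = 100.0 * forecasting_efficiency(r)
        print(f"r = {r:.2f}: {gain:.1f} per cent better than chance")
```

The same function can be applied to any of the validity coefficients reported for the tests in this chapter; even a correlation of .60 improves on chance by only about one-fifth under this index.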


Paper-and-pencil Tests

In paper-and-pencil tests, indications of the presence of mechanical ... are secured through tests of information, by matching of pictures ... which in some way belong together, and by figuring out what the result would be from a pictured situation. While these tests differ widely in their content, they are all alike in requiring no ... or any particular performance ... Four tests are reviewed here: (1) the Revised Minnesota Paper Form Board, (2) the Mellenbruch Mechanical Aptitude Test for Men and Women, (3) Aptitude Tests for Occupations, and (4) the Differential Aptitude Tests.


The Revised Minnesota Paper Form Board is an outgrowth of ... Beta and a more complicated ... board of O'Rourke. The present edition requires the subject to recognize ... that figure which represents the two figures which are separated. The illustrations (Fig. 25) will make clear the nature of the ... These illustrations show that the problem here is to discriminate patterns in two dimensions. Studies show that the test correlates .25 to .30 ... grades in descriptive geometry; ... .40 to .45 with some of the semi-... and success of inspectors. Some investigators found the test of less value in these various occupations because it had low correlations with intelligence scores and ... correlation with mechanical-aptitude tests. The manual claims, and with some justification, the following: "The evidence thus far accumulated appears to indicate that high ... on this test are predictive of (1) ability to learn mechanical drawing and descriptive geometry; (2) success in mechanical occupations; and (3) success in engineering courses." They base their contention about the geometry on a correlation of .25 to .30 (certainly not too solid a base) and their prediction about engineering on the fact that engineering students scored higher on the test than did others. Success in mechanical prediction rests on a study by Crawford (1941)¹ which indicated that this test was superior to others in predicting mechanical ability. The reliability for one form is reported as .85 and for both together as .92. Norms based on a heterogeneous population of 5,000 subjects are available. The revised edition is machine-scored, but the norms for this edition are based on only 548 white enlisted men. Because ...

Reprinted by permission from the Psychological Corporation, New York.
Stuit, Dewey B., The Third Mental Measurements Yearbook (Oscar K. Buros, ...), ... 677.




1 Crawford, John Edmund, ... of Some Factors upon Which ... Achievement in Elementary Machine ... Drafting, unpublished doctor's thesis, University of Pittsburgh, 1941.


Fig. 26. Mellenbruch Mechanical Aptitude Test for Men and Women. Match ... with numbers. (By permission of Paul L. Mellenbruch.)


... were ... The tests which showed differences between ... that the test as a whole shows only a 6-point difference ... for comparable ages. This characteristic, however, ... so many tests have shown such a decided ... boys and girls in this capacity that the difference may be a real fact. At any rate, this test can be successfully used for both boys and girls, both men and women.


... intelligence. In this case the correlation coefficients ranged from .17 to ..., which are low enough to be satisfactory. Satisfactory norms are provided for Grades 7 to 12, college freshmen, and a wide range of mechanical occupations.

Two uses are clearly indicated for this test: (1) to help decide whether a student would profit from courses in manual arts, and (2) to indicate an individual's aptitude for those occupations which require a considerable amount of mechanical ability. The manual recommends as follows:

1. That an individual who receives fewer than 30 points on the test be not employed for mechanical work.
2. That an individual who receives 30 to 40 points be employed for simple routine manual tasks.
3. That an individual who receives 40 to 55 points be employed to perform complex but routine tasks.
4. That an individual who receives above 55 points be employed to perform tasks demanding mechanical ingenuity.
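
The manual's four recommendations amount to a simple lookup on the total score, sketched below. The function name is hypothetical, and two points are assumptions rather than statements of the manual: the top band is read as "above 55", and because 40 appears in both the second and the third band, a score of exactly 40 is placed in the lower band.

```python
def mellenbruch_recommendation(points: int) -> str:
    """Map a total score to the manual's suggested level of work.

    Assumption: a score of exactly 40 (listed in two bands) is assigned
    to the lower band, and the top band begins above 55.
    """
    if points < 0:
        raise ValueError("a score cannot be negative")
    if points < 30:
        return "not to be employed for mechanical work"
    if points <= 40:
        return "simple routine manual tasks"
    if points <= 55:
        return "complex but routine tasks"
    return "tasks demanding mechanical ingenuity"


if __name__ == "__main__":
    for score in (25, 30, 40, 41, 55, 56):
        print(score, "->", mellenbruch_recommendation(score))
```

A counselor using such cutting scores should keep in mind the modest validity coefficients reported for tests of this kind; the bands are guides, not verdicts.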
In the third place, the test of mechanical aptitude, Form ..., is the second of six Aptitude Tests for Occupations.¹ It consists altogether of pictures and drawings and contains the following items:

1. Objects or tools and their use
2. Patterns which represent objects
3. Patterns that fit designs
4. Motor driven shafts and ...
5. Names of joints

... minutes are allowed to take the test and the answers are recorded on a separate sheet. Examples are shown in Fig. 27. This test has been partially validated by correlating it with other tests of mechanical aptitudes and with courses in machine shop (.40) and mechanical drawing ... With tests of mechanical knowledge, mechanical comprehension, spatial relations, and mechanical ability the correlations range from .41 to .64 (manual). The reliabilities for men and boys range from ... at age 13 to .83 at age 9. For girls and women the coefficients are somewhat lower. As a whole the test is probably too short for high reliabilities. There is a chance that increase in the time allowed would improve the test.

Directions: Mark, as you have been told, the letter of the wheel which is turning in the direction indicated.
49. Which lettered wheel is turning in this direction?

Directions: Mark, as you have been told, the letter of the proper pattern to use in making the object.
50. ...

Directions: Mark the letter of the figure at the right which will exactly fit the design on the left.

Fig. 27. Aptitude Tests for Occupations, mechanical aptitude. (Roeder and Gra...)


... punctuation, or spelling ... recognized in the parts. It takes 25 minutes to administer, and its average reliability is .88.

1 Psychological Corporation, New York, 1947.


These tests have been and are being studied. Correlations have been made with average school marks, marks of separate subjects, intelligence tests, and other tests which purport to measure the same abilities. Thus far the tests have shown themselves to be equal to and in many cases superior to other tests in this field.


SUMMARY 


Tests in two areas of the fine arts have been considered: (1) for music and (2) for art. In both these areas, tests for capacity and tests for achievement and appreciation have been introduced. In each case there have been attempts to analyze the larger area into its fundamental characteristics. In both cases, the combination of elements did not make the whole. In music, there seemed to be other factors in addition to pitch, rhythm, timbre, intensity, time, and memory. In art, color, line, proportion, perspective, and memory did not constitute the whole of art ..., although efficiency in these characteristics was indicative of good ... in art.

The tests of achievement in music and in art were not as well developed as tests of aptitude. Real achievement in both music and art consists of products which have to be rated. Aspects of musical achievement are ... recognition of tunes from the written ... measured by the ability to copy a ... described man, or cons...

The reliability of these tests is generally satisfactory. Their validity is always in doubt because of the lack of an indisputable criterion of achievement. If, for example, we correlate the Seashore music test with marks in a music course we are using a criterion which is composed of marks in music, class attendance, and intelligence. The criterion of success in art courses also is a mixture; when used it gives no ... of the characteristic we wish to ...

When we turn to manual arts we find a similar story. We do have good tests of ... aptitudes. None of the tests, however, cover adequately the objectives set down as outcomes of instruction. Tests of manual arts are divided into (1) tests of technical and related information, and (2) tests of actual performance. Actual performance may, for example, be measured by the efficiency with which a piece of wood is fashioned into an object described in a working drawing.

Tests of mechanical aptitude and ability are divided also into (1) ... information about mechanical ability, and (2) mechanical ... tests of information ... so-called tests of mechanical ... with intelligence ... Our best test in this ... is the Minnesota Mechanical Assembly Test because of the manner in which it was constructed. The builders of this test took the ... criterion of success which could be depended upon. Once this was established they could check the items of their tests against ... to ensure their efficiency. There are few satisfactory tests in the field of home economics although rating scales and check lists are available.


LIST OF TESTS OF MUSIC, ART, HOME ECONOMICS, 
AND MECHANICAL ABILITY 


I. MUSICAL APTITUDE 


1. Seashore Measures of Musical 
Talent, revised edition, grades 5-16 and 
adults. 1919-1939. Two series of three 
records each, Series A, for the testing of 
unselected groups in general surveys; 
Series B, for the testing of musicians and 
prospective or actual students of music. 
Blanks on which to record judgments. 
Time: 60-80 minutes, Authors: Carl E. 
Seashore, Don Lewis, and Joseph S. 
Saetveit. R.C.A. Manufacturing Com- 
pany, Inc., Camden, N.J. 

2. Kwalwasser-Dykema Music Tests, grades 4-16 and adults. 1930. One form. Time: 60 minutes. Authors: Jakob Kwalwasser and Peter W. Dykema. Carl Fischer, Inc., New York.

3. Drake Musical Memory Test, ... ages 8 and over. Time: 25 minutes. Author: Raleigh M. Drake. Public School Publishing Company, Bloomington, Ill.

4. Musical Aptitude Test, Series ..., grades 4-10. ... Tests given with piano. Time: 40-50 minutes. Authors: Harvey S. Whistler and Louis P. Thorpe. California Test Bureau, Los Angeles, Calif.

II. MUSICAL ACHIEVEMENT

1. Beach Music Test, grades 4-16. 1920-1939. One form. Time: 40 minutes. Authors: Frank A. Beach and H. E. Schrammel. Kansas State Teachers College, Emporia, Kans.

2. Knuth Achievement Tests in Music, grades 3-12. 1936. Two forms. Three levels. Division a, grades ...; Division b, grades 5-6; Division c, grades 7-12. Nontimed (40-45 minutes). Author: William E. Knuth. Educational Testing Bureau, Minneapolis.

3. Strause Music Test, grades ... 1937. Three forms. Time: 60 minutes. A general achievement test. Authors: Catherine E. Strause and H. E. Schrammel. Kansas State Teachers College, Emporia, Kans.

4. Kwalwasser-Ruch Tests of ... 1927. Ten parts. Authors: ... Kwalwasser and G. M. Ruch. ... Educational Research and ..., University of Iowa, Iowa City. 40-50 minutes.

5. ... Test of ... Information and Appreciation, grades 9-16. 1927. One form. Time: ... Author: Jacob Kwalwasser. Bureau of Educational Research ..., University of Iowa, Iowa City.


III. ART

1. Horn Art Aptitude ..., preliminary form, 1944 ... 12-16. One form. Time: ... Institute of Technology, ...

2. Meier-Seashore Art Judgment Test, grades 7-12. 1929-1930. One ... -80 minutes. Authors: Norman Charles Meier and Carl Emil Seashore. Bureau of Educational Research and Service, State University of Iowa, Iowa City.

3. Meier Art Tests, I, Art Judgment ... form (100 paired pictures). Nontimed (... minutes). Author: Norman Charles Meier. Bureau of Educational Research and Service, University of Iowa, Iowa City.

4. McAdory Art Test, all grades, colleges, and art schools. 1929. One form, a folio ... Nontimed (about 90 minutes). Author: Margaret McAdory ... Bureau of Publications, Teachers College, Columbia University, New York.

5. ... Fundamental Abilities of ..., grades 3-12. 1927. One form. Three parts. Time: 30 (35 minutes). Author: Alfred S. Lewerenz (see text). ... Test Bureau, Los Angeles, ...

6. Knauber Art Ability Test, grades 7-16. 1932-1935. One form. Nontimed (180 minutes). Author: Alma Jordan Knauber (see text). Published by the author, Cincinnati, Ohio.


IV. HOME ECONOMICS

1. ... Home Economics Test, grades 5-10. 1931. Two forms, A and B. Time: ... 60 minutes. Authors: Edna ... and John L. Stenquist. World Book Company, Yonkers, N.Y. (out of print).

2. State High School Tests for Indiana, grades 7-8. 1945-1946. Four tests: (1) assisting with care and play of children, (2) ... with clothing problems, (3) helping with food in the home, and (4) ... with the housekeeping. Time: ... minutes for each test. Authors: Test 1, Alice Stair and Muriel G. McFarland; Test 2, Elizabeth Anderson, Muriel McFarland, and Kathleen McGillicuddy; Test 3, Elizabeth Anderson and ... McFarland; Test 4, Evelyn ..., ... McGillicuddy, and ... McFarland. State High School Testing Service, Purdue University, Lafayette, Ind.

3. State High School Tests for Indiana, ... 1943-1947. Seven tests: (1) ... development, (2) clothing I, (3) clothing II, (4) foods preparation, (5) planning for family food needs, (6) home care of the sick, (7) housing the family. Time: 55-60 minutes. Authors: Test 1, Robert ... Kelly, Alice Stair, and Muriel G. McFarland; Test 2, Mary I. Healey, Jeannette O. Parvis, and Muriel G. McFarland; Test 3, Mary I. Healey, Ruth Davis Moutoux, Jeannette O. Parvis, Lo... Stedman, and Muriel G. McFarland; Tests 4 and 5, Mary T. Swickard and Muriel G. McFarland; Test 6, Jeannette O. Parvis, Gleela Ratcliffe, Ruth Davis, and Muriel G. McFarland; Test 7, Jeannette O. Parvis and Muriel G. McFarland. State High School Testing Service, Purdue University, Lafayette, Ind.

4. Minnesota Check List for Food Preparation and Serving, revised edition, grades 7-16. 1945. Author: Clara M. Brown. University of Minnesota Press, Minneapolis.

5. Minnesota Food Score Cards, high school and college. 1946. Author: Clara M. Brown. Cooperative Test Service, New York.

6. Unit Scales of Attainment in Foods and Household Management, grades 7-9. 1933. Two forms. Nontimed (50 minutes). Authors: Ethel B. Reeve and Clara M. Brown. Educational Test Bureau, Minneapolis.

7. Tests in Comprehension of Patterns, ... 1927. One form. Nontimed. ... L. Stevenson and ... M. Trilling. Public School Publishing Company, Bloomington, Ill.


V. MECHANICAL ABILITY

1. O'Connor Finger Dexterity Test, ... 13 years and above. ... metal pegs or pins, 1 inch in length; a metal plate with 100 holes, each hole large enough for ... with the fin... hole until all holes are ... minutes. Stevens Institute of Technology, Hoboken, ...

2. O'Connor Tweezer Dexterity Test, about 13 years and above. 100 metal pins as above; subject picks up one pin at the time with small tweezers and places one pin in each hole. Time: 8-10 minutes. Stevens Institute of Technology, Hoboken, N.J.

3. ... Manual Dexterity Test, 13 years and above. Consists of four rows of 15 blocks each. Score is the time it takes (1) to pick up the blocks with one hand and put them in the hole, or (2) to pick them up with one hand, turn them over with the other, and put them back, or (3) to move each block to next hole above. Test of speed. Author: W. Z. Ziegler. University of Minnesota, Minneapolis.

4. I.E.R. Assembly Test for Girls, shortened form. Originally constructed by H. A. Toops, and shortened by Emily T. Burr and Zaida M. Metcalf. Time: 25-30 minutes. Norms adopted by Burr and Metcalf from experience. C. H. Stoelting, Chicago, Ill.

5. O'Rourke Mechanical Aptitude ... minutes. Psychological Corporation, New York.

6. Stenquist Mechanical ... Tests I and II, ... is made up of 95 ...

7. Revised Minnesota Paper Form Board (see text), boys aged 9 ... and men. Authors: Rensis Likert and William H. Quasha. Psychological Corporation, New York.

8. MacQuarrie Test for Mechanical Ability (see text). Author: T. W. MacQuarrie. California Test Bureau, Los Angeles, Calif.

9. Minnesota Mechanical Assembly Test, junior and senior high ... men (see text). Authors: D. ... Paterson, R. M. Elliot, L. D. Anderson, and Edna Heidbreder. Marietta Apparatus Co., Marietta, Ohio.

10. Prognostic Test of Mechanical Abilities, grade 7 to adult. ... Time: 45 minutes. Authors: J. Wayne Wrightstone and Charles E. O'Toole. California Test Bureau, Los Angeles, Calif.

11. Minnesota Spatial ... Test, upper elementary grades, high ..., and adults. Consists of four ... form boards, A, B, C, D. From each form board, 58 pieces ... form and size are cut. Time to put pieces back into boards B, C, and ... is the score (board A is used for practice). Time: 15-45 minutes. Authors: Donald G. Paterson, Richard ..., ... Dewey Anderson, H. A. ..., and Edna Heidbreder. Marietta Apparatus Company, Marietta, Ohio.

12. Mellenbruch Mechanical Aptitude Test for Men and Women (see text), grades 7-16 and adults. Author: ... Associates, Chicago, Ill.

13. Test of Mechanical Comprehension, grade 9 and over. ... Bennett and Dinah E. ... (see text). Psychological Corporation, New York.


QUESTIONS AND EXERCISES

1. Describe the main features of Seashore's Measures of Musical Talents. What success has it had as a predicter of musical accomplishment?
2. What other factors enter into musical accomplishment in addition to ... included in the Seashore tests?
3. Explain the difficulties which enter into the measurement of musical achievement.
4. Summarize the uses of tests in music.
5. What are the salient features of the Meier-Seashore Art Judgment Test? Compare it in detail with the McAdory Art Test. What are two weaknesses of the ... test?
6. ... that the Knauber Art Ability Test is well named?
7. ... are norms of achievement tests ... fine arts established?
8. What is the correct procedure in ... to manual arts when students are discovered with a proved ... in academic subjects?
9. Describe the procedure used (a) to construct the Newkirk-Stoddard Home Mechanics Test, and (b) to establish the criterion for the Minnesota Mechanical Assembly Test. Why is this latter procedure so highly regarded?
10. How is the predictive capacity of a test indicated? How efficient is a prediction based on a correlation of .60?
11. What explanation might be advanced for including the measurement of music, art, home economics, and mechanical aptitude in one chapter?
BIBLIOGRAPHY

I. MUSIC

DRAKE, RALEIGH M.: "The Validity and Reliability of Tests of Musical Talent," Journal of Applied Psychology (1933) 17:447-458.

FARNSWORTH, PAUL R.: "Are 'Music Capacity' Tests More Important than 'Intelligence Tests' in the Prediction of the Several Types of Musical Grades?" Journal of Applied Psychology (1935) 19:347-359.

GREENE, EDWARD B.: Measurements of Human Behavior, pp. 425-438. New York: The Odyssey Press, Inc., 1941.

HIGHSMITH, J. A.: "Selecting Musical Talent," Journal of Applied Psychology (1929) 13:486-493.

JOHNSON, GUY B.: "A Summary of Negro Scores on the Seashore Musical Talent Tests," Journal of Comparative Psychology (1931) 11:383-393.

KNUTH, WILLIAM E.: The Construction and ... of Music Tests Designed to Measure Certain Aspects of Sight Reading, unpublished doctor's thesis, University of California, 1932.

MURSELL, JAMES L.: The Psychology of Music. New York: W. W. Norton & Company, 1937.

"... Success in the Study of Music," ... Administration Technical Bulletin TB7-77, Dec. ...

SCHOEN, MAX: The Psychology of Music: A Survey for Teacher and Musician. New York: The Ronald Press Company, 1940.

SEASHORE, CARL E.: Psychology of Music. New York: McGraw-Hill Book Company, Inc., 1938.

SEASHORE, CARL E.: In Search of Beauty in Music. New York: The Ronald Press Company, 1947.

STANTON, HAZEL M.: Prognosis of Musical Achievement. Rochester, N.Y.: Eastman School of Music, University of Rochester, 1929.
II. ART

CARROLL, HERBERT A.: "What Do the Meier-Seashore and the McAdory Art Tests Measure?" Journal of Educational Research (1933) 26:661-665.

FAULKNER, RAY: "Standards of Value in Art," "Art in American Life and Education," Fortieth Yearbook of the National Society for the Study of Education, Chap. XXVII, pp. 401-426. Bloomington, Ill.: Public School Publishing Company, 1941.

FAULKNER, RAY: An Experimental Investigation Designed to Develop Tests to Measure Art Understanding and Appreciation, unpublished doctor's thesis, University of ...

GREENE, EDWARD B.: Measurements of Human Behavior, Chap. 13. New York: The Odyssey Press, Inc., 1941.

KINTNER, MADALINE: The Measurement of Artistic Abilities. New York: Psychological Corporation, 1933.

KNAUBER, ALMA JORDAN: "The Construction and Standardization of the Knauber Art Tests," Education (1935) 56:165-170.

LEWERENZ, ALFRED S.: "Predicting Ability in Art," Journal of Educational Psychology (1929) 20:702-704.

MEIER, NORMAN C.: "Recent Research in the Psychology of Art," "Art in American Life and Education," Fortieth Yearbook of the National Society for the Study of Education, Chap. XXVI. Bloomington, Ill.: Public School Publishing Company, 1941.


III. MANUAL ARTS

BABCOCK, HARRIET, and MARION RINES EMERSON: "An Analytical Study of the MacQuarrie Test for Mechanical ...," ... Educational Psychology ...

...: A Summary of Manual and Mechanical Ability Tests. ... Psychological Corporation, ...

BINGHAM, WALTER VAN DYKE: Aptitudes and Aptitude Testing. New York: Harper & Brothers, 1937.

...: MacQuarrie Test for Mechanical Ability. Los Angeles, Calif.: California Test Bureau.

HORNING, S. D., and RUTH S. LEONARD: "Testing Mechanical Ability by the MacQuarrie Test," Industrial Arts Magazine (1926) 15:348-350.

MORGAN, W. J.: "Some Remarks ... Results of Aptitude Testing in Technical and Industrial Schools," Journal of Social Psychology (1944) 20:19-29.

NEWKIRK, LOUIS V.: "Validating and Testing Home Mechanics Content," ... in Education, Vol. 6, No. 4, University of Iowa, 1930-1932.

PATERSON, DONALD G., et al.: Minnesota Mechanical Ability Tests. Minneapolis: University of Minnesota Press, 1930.

PERRY, FAY V., and M. E. BROOM: "A Study of Standard Tests and of Teacher-Made Objective Tests in ...," Journal of Educational Research (1932) 26:102-104.

STOY, E. G.: "Additional Tests for Mechanical Drawing Aptitude," Personnel Journal (1928) 6:361-366.

TIFFIN, JOSEPH: Industrial Psychology, 2d ed. New York: Prentice-Hall, Inc., 1947.


CHAPTER 13

Measurement of Physical Education and Health


Studies of the physical condition and general health of the draftees in the First World War and the Second World War have shown that hundreds of thousands of our young men were in such poor physical condition that they were doubtful risks as members of our armed forces. Knowledge of such conditions has brought about a renewed interest in the improvement of the physical condition and general health of all people. Particularly has this movement influenced the physical-education programs for all students in our schools and colleges. As in other areas of instruction, improvement comes with a greater degree of certainty when (1) objectives are clearly defined, (2) measuring instruments which indicate progress toward the objective are provided, and (3) procedures of instruction are modified in the light of objective measures.

OBJECTIVES IN PHYSICAL EDUCATION

That objectives in instruction in physical education reflect the best present philosophy in education is indicated by such lists as have been ... by its teachers. Many leaders in this field would agree that the development of skills (neuromuscular), physical fitness, and social efficiency constitutes the general purpose of instruction in physical education.¹ They undoubtedly would also be in general agreement with La Porte's analysis of objectives.² Included in this more detailed list are:

1. The development of skills (athletic, gymnastic, aquatic, rhythmic) for immediate educational purposes as well as for use later in leisure time. This would involve also a knowledge of the rules, techniques, etc., of certain skills.
2. Development of social standards, appreciations, and attitudes by means of intensive participation in sports and games under favorable conditions of leadership.
3. Development of certain personality traits such as ... and self-expression, which come as a result of ... participate in certain activities. Such participation also results in development of leadership capacities.
4. Development of safety habits in actual life situations so that they will be continued in later life.
5. Elimination of those physical defects, such as bad posture, which are remediable.
6. Development of essential health habits, health knowledge, and health attitudes in such a way that they will function in the child's life during school and later when he becomes an adult.

From this list, only slightly modified from the original, it is evident that the aims of physical education are abundantly worthy of ... and fit in with the improvement of the whole personality, an idea so prevalent in modern educational philosophy.

1 Bovard, John F., Frederick W. Cozens, and E. Patricia Hagman, Tests and Measurements in Physical Education, 3d ed., p. 5. Philadelphia: W. B. Saunders Company, ...
2 La Porte, William ..., "... Major Objectives of Health and Physical Education," California Physical Education Health and Recreation Journal, January, 1936, ... Permission for use from Professor William ...


TESTS OF PHYSICAL CAPACITIES

It is very difficult to distinguish between capacity and ability, for the moment a child is born his environment ... On the other hand, ... think of skills or abilities. Tests of physical capacities, therefore, indicate a child's possibilities which we ... work with and develop.


nk 


e гей 

... the two factors measured ... Schneider Test.¹ The pulse rate and systolic blood pressure are ... times during 5 minutes of rest in a reclining position. The ... then assumes an erect position. After a delay of 2 to 3 minutes the pulse rate and systolic blood pressure are taken and recorded. The differences between the readings (1) when reclining and (2) when standing are ... the general physical condition. A second part of this test ... pulse rate and blood pressure before and after ... The exercise consists of placing one's right foot in a chair 18 inches high and then bringing the left foot slowly to the side of the right once every 3 seconds for 15 seconds. After the exercise, the pulse rate is read at intervals of 60 seconds, 90 seconds, and 120 seconds. ... are furnished which make the scoring easy. The total points are ... a perfect record. A score of 9 points or less indicates deficiency.

1 Schneider, E. C., "A Cardiovascular Rating as a Measure of Physical Fatigue and Efficiency," Journal of the American Medical Association (1920) 74:1...

The Harvard Step Test, developed during the Second World War, does not bother with pretesting but uses much more intense exercise. In this test "the exercise consists of stepping up on, and down from, a 20-inch ... 2-second intervals, 30 times a minute for 5 minutes, unless the individual is unable to continue before the expiration of the specified time. Beginning exactly one minute after he stops, count the number of heartbeats for exactly 30 seconds."¹ Only two observations are necessary: (1) the duration of effort, and (2) the number of heartbeats. By use of a table it is possible to substitute these two variables and read ... an index of efficiency. For normal healthy young men this index is 50 to 80. Those men in poor physical condition score below 50 and those in ... condition score above 80.
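
The text reads the index from a table, but the two observations can also be combined directly. The sketch below assumes the commonly cited short form of the Harvard Step Test index, namely the duration of effort in seconds times 100, divided by 5.5 times the 30-second pulse count; this formula is not given in the passage, and the function names and category labels are illustrative only. The cut-off points of 50 and 80 are those stated above.

```python
def harvard_step_index(duration_seconds: float, pulse_count_30s: int) -> float:
    """Short-form index (assumed): duration * 100 / (5.5 * pulse count).

    duration_seconds: how long the subject kept stepping (at most 300).
    pulse_count_30s: heartbeats counted from 1 to 1.5 minutes after stopping.
    """
    if not 0 < duration_seconds <= 300:
        raise ValueError("duration must be between 0 and 300 seconds")
    if pulse_count_30s <= 0:
        raise ValueError("pulse count must be positive")
    return duration_seconds * 100.0 / (5.5 * pulse_count_30s)


def classify_index(index: float) -> str:
    """Apply the 50 and 80 cut-offs quoted in the text."""
    if index < 50:
        return "below 50: poor physical condition"
    if index <= 80:
        return "50 to 80: usual range for healthy young men"
    return "above 80"


if __name__ == "__main__":
    # A subject who completes the full 5 minutes and shows 75 beats.
    index = harvard_step_index(300, 75)
    print(f"index = {index:.1f} ({classify_index(index)})")
```

Because the index depends on only two numbers, a teacher can compute it on the spot; the published table serves the same purpose without arithmetic.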

While these individual tests are undoubtedly efficient, the ordinary teacher of physical education desires a group test, even though less precise, which can be administered to 20 or 30 pupils at one time. Such a test is the Michigan Pulse Rate Test for Physical Fitness.² In this test the children are first taught to count their own pulse. After this process has been well learned they count their own pulse while standing at ease and record their counts on the blackboard. The ... then runs in place at ... per second for 15 seconds. They must lift their feet 6 inches high at least. They again count their pulse ½ minute after exercise, 1 minute after exercise, and at 2-minute and 3-minute intervals after exercise. They record their counts on the blackboard.

If the child's pulse returns to normal after ½ minute his score is A; after 1 minute, B; after 2 minutes, C; after 3 minutes, D; and E if it takes longer than 3 minutes. If his pulse is irregular his grade drops one rank.
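
The letter grades of the Michigan Pulse Rate Test follow directly from the time the pulse takes to return to its resting count, as the sketch below shows. It takes the first reading as one-half minute after exercise and reads "drops one rank" as a fall of one letter, with E remaining the lowest grade; both readings are assumptions, and the function name is hypothetical.

```python
def michigan_pulse_grade(recovery_minutes: float, irregular: bool = False) -> str:
    """Letter grade from the time the pulse takes to return to normal.

    Assumptions: the first interval is one-half minute, and an irregular
    pulse lowers the grade by one letter, never below E.
    """
    if recovery_minutes < 0:
        raise ValueError("recovery time cannot be negative")
    grades = "ABCDE"
    limits = (0.5, 1.0, 2.0, 3.0)  # minutes after exercise for A, B, C, D
    rank = len(limits)  # default: longer than 3 minutes, grade E
    for position, limit in enumerate(limits):
        if recovery_minutes <= limit:
            rank = position
            break
    if irregular:
        rank = min(rank + 1, len(grades) - 1)
    return grades[rank]


if __name__ == "__main__":
    for minutes in (0.5, 1, 2, 3, 4):
        print(minutes, "minute(s):", michigan_pulse_grade(minutes))
```

Since the children record their own counts, the grade can be no more accurate than their counting, a limitation that applies to every self-recorded group test of this kind.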
" А 
1 Morehouse, Laurence E., and Augustus T. Miller, Jr., Physiology of Exercise, p. ... St. Louis: The C. V. Mosby Company, Medical Publishers.
2 "Physical Education in the State of ...," American Physical Education Review (1920) 25:138-139.


:ncinle, is the 
À second test, more inclusive but built on the same шше, four 
California Group Functional Test.! This test may be divi 
: Бу :on to age 
а the first part the body weight is considered in its Pp 
and height. Needed figures are secured from the American 
jation. : " faces 
Li second part, the breath-holding test, children Mert e 
oriented toward the blackboard hold their breath as long A P bales: 
while the leader counts aloud elapsing seconds. When each chi 
he records the time on the blackboard. and after 
3. The third part has the children count their pulse befi pe Де facing 
doing 25 forward body bends in 30 seconds. The children w е for 0 
the board count their pulse for 30 seconds, then stand at еа 


n 

ount 0 
seconds, then count their pulse for 30 seconds and record the с 
the blackboard. 


4. The records for th 
mile are also kept. Supp 
are excused from the te 
up during the test, (3) 


5% 
€ potato race for the girls and for me 0 
lementary data are collected (1) of chil He k 
st at their own request, (2) of children in sto 
of children that the leader thought it best s afte! 
during the test, (4) of children that showed marked breathlessnes t 


à e 
the test, and (5) of those that showed marked fatigue. There 15 ehe. 
of the reliability or validity of these last two tests. In addition, t reco 
errors appearing in the record because the children might not 
their pulse rates correctly. cher of 
“The cardiovascular tests are of limited use to the average tea point 
physical education,” ‚ Cozens, and Hagman.’ Ше ет” 
ty of such measures is affected by age, 56% 
perature, climate, humidity, emotional conditions, and altitude- 


TESTS OF STRENGTH

Along with general physical fitness as indicated by the cardiovascular tests is that of strength. ... back, and legs are measured by a variety of dynamometers. The word dynamometer comes from ... by chinning, and by ... of lifting an 8-pound weight with the middle finger until fatigue sets in. The weight is attached to a string which works over a pulley and is attached to the middle finger. Lung capacity is measured by the spirometer, into which an individual breathes all the air he has previously packed into his lungs. What the physical-education teacher would like is some way of combining these different measures of strength into a simple index.

1 Stolz, H. R., "Group Functional Tests," Circular Letter M 30, No. ... Sacramento, Calif.: California State Board of Education, Department of Physical Education, 1923.
2 Op. cit., p. 87.


The Rogers Strength Index

The Rogers Strength Index1 recommends itself because of its simplicity and effectiveness. The index is secured by adding the scores in the following manner:

1. Number of cubic inches in lung capacity2
2. Number of pounds pressure in right grip
3. Number of pounds pressure in left grip
4. Number of pounds lifted, using back
5. Number of pounds lifted, using legs
6. Strength of arms: (pull-ups + push-ups) X [weight/10 + (height in inches - 60)]

A physical fitness index may be computed from the strength index by dividing the strength index by age and weight norms times 100. Its author claims that this test is a highly valid measure, that it is two and one-half times as accurate as the use of weight alone and almost twice as accurate as the optimal combination of age, height, and weight. All tests can be given at the rate of one boy per minute and computed in a few seconds. Furthermore, the strength index also is highly reliable.

... very ... do not correlate ... ups) + 1 (leg lift) as a ... This index correlates .49 with the athletic ...

Weighted tests have been devised for measuring ... high school students. The indices are:

Boys' strength: .1 broad jump + 2.3 shot-put (4 pounds) + weight
...: ... + 3 shot-put + weight

1 Rogers, Frederick Rand, Tests and Measurements Programs in the Redirection of Physical Education. New York: Bureau of Publications, Teachers College, Columbia University.
2 Rogers, Frederick Rand, Physical Capacity Tests in the Administration of Physical Education. New York: Bureau of Publications, Teachers College, Columbia University.
3 Anderson, Theresa W., "Weighted Strength Tests for the Prediction of Athletic Ability in High School Girls," Research Quarterly of the American Association for Health, Physical Education and Recreation (1936) 7:136-142.
4 Stansbury, Edgar, "A Simplified Method of Classifying Junior and Senior Boys into Homogeneous Groups for Physical Education Activities," Research Quarterly of the American Association for Health, Physical Education, and Recreation (1941) ...


TESTS OF POSTURE

Good posture is more of a condition than a capacity.

The best measures of posture are obtained from photographing the subject against a board marked off in quadrilles. The subject stands on a turntable placed at a known distance from the quadrille board. Photographs are made of the individual from different positions so that measured results can be secured quickly and accurately. Most schools, however, are not equipped with cameras, dark rooms, and quadrille boards ...

Fig. 28. Samples of silhouette scale (Clifford L. Brownell). (By permission of Bureau of Publications, Teachers College, Columbia University, New York.)

... posture becomes more a matter of ... measurement. ... next one seems better and then starting at the top moves the sample down until the next one seems worse. The average of the two scores secured is the child's posture score.

1 Bovard, Cozens, and Hagman, op. cit., pp. 42-45.
2 Brownell, Clifford L., A Scale for Measuring the Anterior-Posterior Posture of Ninth Grade Boys, Contributions to Education, No. ... New York: Bureau of Publications, Teachers College, Columbia University, 1928.


TESTS OF MOTOR COORDINATION

Another of these capacities to be considered here is that of motor coordination. How do quickness of reaction, strength, breathing, etc., work together in performance? It is this integration of action toward a certain goal that we think of under the term "motor coordination."

Brace Scale of Motor Ability Tests1

This scale or set of tests is made up of two batteries of 10 events each, ... easy to give and to score. It is suitable for ages 8 to 18. The following samples indicate the nature of the tests:

4. Walking in a straight line, heel to toe, for ten steps
Kneel on both knees, with arms folded behind the back, and stand ...
Full turn left in the air and land without losing balance
Jumping through a loop formed by grasping one toe with opposite hand
Bend forward, place both hands on the floor, raise the right leg, touch forehead to the floor, and stand without losing balance
19. Jumping to feet from kneeling position
... stand for 5 seconds ... extended forward and recover position

There is also an Iowa Revision of the Brace Scale of Motor Ability Tests.2 McCloy, who did the revision, tried out 40 stunts and eliminated them one by one until he had 21 items left. He retained 10 of the items of the original battery, and modified the administration and scoring procedures ... instructions for administering and scoring the test and for giving the test to groups of subjects. ... the original test Brace thought that ... proved the validity of the test. ... would aid greatly in classifying pupils for ... physical education. The results ... do aid us in the study of special performance disabilities as well as in the equating of groups in physical education.

1 Brace, David K., Measuring Motor Ability. New York: A. S. Barnes and Company, 1927.
2 McCloy, Charles H., Tests and Measurements in Health and Physical Education, pp. ... New York: Appleton-Century-Crofts, Inc., 1942. By permission.


ACHIEVEMENT TESTS

Achievement tests in physical education follow the same principles of construction as do the tests we have described thus far. The skill ... must be carefully selected so as to be representative of the total ability. The test must be sufficiently reliable. It also must be valid. The criteria against which the test is validated may be (1) scores obtained by the ratings of experts, (2) T-scores obtained from a rich variety of tests of the ability in question, and (3) scores from a round robin in which each player becomes an opponent of every other one. When possible the tests should be applicable to groups. One criterion emphasized is somewhat different from other tests. When possible it is ... a test which may be used both as a practice test and as an index of achievement.1 As in other areas, norms should be computed from representative populations.

Achievement scales in physical education have been prepared for boys and girls in elementary, junior high, and senior high schools. These tests along with their instructions for administering and scoring appear in three volumes.2 Let us consider first Achievement Scales in Physical Education Activities ... In this book instructions ... who are classified ... as Class C. It is ... only with those in his class. Samples of the ... basketball throw for distance, jump and reach, playground ... throw for ..., push-up, running high jump, and standing hop, step, and jump. Norms for classes from A to H were computed from some 79,000 children. ... the scores from each event are transmuted into standard scores. In like manner, achievement scores are furnished for high school boys and for high school and college girls. Since it was shown that height, weight, and age are uncorrelated with athletic abilities after age 16, it was necessary to have only one set ... tests just described. Most ... are constructed after ... level.3 ... practical tests of considerable promise ... they describe acceptable tests for badminton, basketball, field hockey, soccer, softball, speedball, tennis, and volley ball.

1 Ibid., pp. 169-172.
2 Neilson, N. P., and Frederick W. Cozens, Achievement Scales in Physical Education Activities ... New York: A. S. Barnes and Company, 1939. Cozens, Frederick W., Martin H. Trieb, and N. P. Neilson, Physical Education Achievement Scales for Boys in Secondary Schools. New York: A. S. Barnes and Company, 1936. Cozens, Frederick W., Hazel J. Cubberley, and N. P. Neilson, Achievement Scales in Physical Education Activities ... New York: A. S. Barnes and Company, 1937.
3 Neilson and Cozens, op. cit., p. 6.


MEASUREMENT AND HEALTH INFORMATION

Good health is indicated, in the final analysis, by the existent physical condition at the present time. Has the subject any disease? Are the organs of his body working as they should? What of his eyes, ears, nose, and ...? ... amount of energy for his age? ... teacher who knows some of the ... of cases to nurse and physician. Another aspect of the problem relates to ... practicing those habits and taking ... continue good health.

There are two phases of ...: (1) health knowledge, and (2) health practices. ... practices depend upon both the knowledge and the attitudes of subjects. There is indeed no assurance that the knowledge of good health practices will lead to good health habits. The best instruction at the present time emphasizes both knowing ... Tests of health knowledge are easier to construct and more ... than tests of health practices. To test the latter the good will ... obtained so that they will report the habits that they actually practice and not those which they think the tester would like for them to practice.

1 Scott, M. Gladys, and Esther French, Better Teaching through Testing. New York: A. S. Barnes and Company.


The Gates-Strang Health Knowledge Tests1 are divided into (1) ... and (2) advanced tests for ... There are ... forms for each division. These tests ... 1925 and were revised in 1937. "The ... extensive curriculum research involving an analysis of mortality, morbidity and accident statistics, popular health ..., ... of children of different ages, and courses of study ... books."2 The elementary tests are ... 60 ... a rich variety of information ... of accidents, how to keep ..., ... in ponds, how tuberculosis is spread, the proper handling of garbage and sewage, etc. Two samples are:

22. The best lunch to choose in the school lunchroom is
a. ...
b. Vegetable soup, baked ...
c. Ice cream and chocolate ...
d. Meat, potatoes, ...
e. Vegetable salad, crackers, iced tea

43. The best way to study about the shape and size of bacteria is
a. Under a bright ...
b. With the naked eye
c. ...
d. Under a microscope
e. Under a hand magnifying glass
The advanced tests are composed entirely of 60 multiple-choice items which are more complicated than those of the elementary series. In these tests the emphasis is upon two major fields: (1) food and nutrition and (2) the prevention and treatment of diseases. More than ... of the 60 items are in some way related to these two headings. ... items on the functioning of certain organs, on the effects of alcohol and tobacco on growth, and ...

24. Vitamins are especially necessary for
a. Giving power to work and ...
b. Giving flavor to food
c. Increasing health and growth
d. Regulating body ...
e. Preventing ...

1 Items by permission of Bureau of Publications, Teachers College, Columbia University, New York.
2 Manual, p. 1; also Gates, A. I., and Ruth Strang, "A Test in Health Knowledge," Teachers College Record (1925) 26:86... By permission.


40. We say a person is immune from a disease when
a. He has not been near sick persons
b. His body has made substances that protect it from the bacteria that cause the disease
c. ... disinfected ...
d. His body resists cold and fatigue
e. He has had the disease three times

The reliability of these elementary and advanced tests varies from .74 ... Validity was determined by the selection of the items. Norms furnished consist of distributions of scores for the elementary tests from a large city system, from a suburban school system, and from rural schools. For the advanced tests there are score distributions ... from a large city high school and from a suburban high school.

These tests are useful for analyzing the health knowledge of a single subject as well as for indicating the general progress of a class. They suffer somewhat from the wide variety of items tested.
... Health Inventory for High School Students,1 ... enlist the students' cooperation in ... practices. This inventory is divided into two parts: (1) health conditions, and (2) health information. This inventory is an outgrowth ... health knowledge in the city of Los Angeles. "... extensive curriculum research ... the analysis of textbooks ... popular health ... and other authorities on health ..."2 ... on health conditions, is ... It is on this part that ...

The most common answers on the status part are "(1) Frequently (2) Occasionally (3) Never" or "(1) Frequently (2) Seldom (3) Never." Questions about being sick in bed, colds, headaches, tiredness, and toothache are asked. Two samples are:

Do you have colds?
(1) Frequently (2) Seldom (3) Never

Do your teeth hurt because of decay?
(1) Frequently (2) Occasionally (3) Never

1 Neher, Gerwin, "A Study of the Health Knowledge, Attitudes, Status and Practices of ..." Unpublished doctoral dissertation, University of ... California.
2 Manual, p. 1. By permission from California Test Bureau, Los Angeles, Calif.


There are 20 items on health practice. Questions as to whether you drink at least one pint of milk a day, maintain a ..., have formed a habit of daily bowel action, avoid colds and other communicable disease, or the average number of hours you sleep per night are asked. Two samples are:

13. Do you ever eat candy or other sweets just before meals?
(1) Frequently (2) Occasionally (3) Never

22. Do you use drugs such as aspirin, bromides, etc. for cure of headaches?
(1) Frequently (2) Occasionally (3) Never

The score of this part is a weighted one. If the answer is the perfect one the subject receives 3 points, 2 for a poorer answer, and ... for the poorest answer. The total of these weighted points makes up the score.
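The weighted scoring of the health-practice items can be sketched as below. This is a hedged illustration, not the publisher's scoring key: the names are hypothetical, the key shown for items 13 and 22 simply assumes that "Never" is the best answer and "Frequently" the poorest, and the weight of 1 for the poorest answer is an assumption because that figure is not legible in the text (only the 3 and the 2 are).

```python
# Points for the best, the poorer, and the poorest answer.
# 3 and 2 come from the text; the final 1 is an assumption.
WEIGHTS = (3, 2, 1)

# Hypothetical key: for each item, the option numbers ordered from
# best answer to poorest answer.
KEY = {
    13: (3, 2, 1),  # eating sweets before meals: "Never" assumed best
    22: (3, 2, 1),  # using drugs for headaches: "Never" assumed best
}


def practice_score(responses, key=KEY, weights=WEIGHTS):
    """Total the weighted points for a pupil's answers.

    responses maps item number -> option number chosen (1, 2, or 3).
    """
    total = 0
    for item, chosen in responses.items():
        rank = key[item].index(chosen)  # 0 = best, 2 = poorest
        total += weights[rank]
    return total


if __name__ == "__main__":
    # A pupil who never eats sweets before meals but occasionally
    # takes aspirin for headaches.
    print(practice_score({13: 3, 22: 2}))
```

With a full key for all 20 items, the same function would give the practice score that the inventory reports.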


Part II of this inventory consists of 69 items entitled "What You Know about Health." The subdivisions are:

1. Public health. This section asks for the definition of slum area, the reliability of radio advertising, and the effects upon health of venereal diseases (eight questions altogether).

2. First aid. Here the test inquires about what to do if you feel faint, what to do if you have a turned ankle, how to neutralize acid spilt on the skin or clothing, etc. (seven items).

43. After sending for a physician the first thing to do for a person who has swallowed poison is to
1. Give him artificial respiration
2. Make him vomit
3. Go to the druggist for an antidote
4. Put him to bed
5. Give him a strong laxative

3. Prevention of disease (15 items). This section includes questions about the pasteurization of milk, what a communicable disease is, why milk turns sour, and how best to control smallpox and diphtheria.

60. Measles is most contagious
1. Before the rash appears
2. When the rash is most noticeable
3. When the skin begins to peel
4. After the skin has peeled
5. When the rash is disappearing

4. Proper health habits (19 items). Here are such questions as why breathing through the nose is best for health, what the correct amount of sleep is for high school boys and girls, and what type of bath is best when you are tired and nervous.


5. ... (18 items). This section raises such questions as the foods which contain the most minerals, the main food value of meat, the prevention of constipation, and how well pork should be cooked.

6. Mental hygiene (nine items). This deals with such problems as the influence of worry on health, the relation between facing life squarely and ... health, as well as the relation between poise and emotional ...

The reliability of the test as a whole is .86. When one breaks down the test into eight different parts, as is recommended in making a profile, one wonders what the reliability of each part might be. The profile, though, does help to tell at a glance just where the student is weak. If this is followed by an item analysis of the weak part, real diagnosis of ... may be attained. The norms are based on returns from 2,415 ... in the city of Los Angeles and are reported in both percentile ... and descriptive words such as very low, low, average, high, and very high.

Uses for both the Gates-Strang Health Knowledge Tests and the Neher Health Inventory for High School Students would be in (1) ... with pupils or students their weaknesses in health information, (2) influencing the teaching procedures of the teachers, and (3) improving courses of study.

LIST OF TESTS OF HEALTH EDUCATION

1. Gates-Strang Health Knowledge Tests. ... forms each level. Elementary tests, grades ...; advanced tests, grades 7-12, 30-35 minutes. Authors: A. I. Gates and Ruth Strang. Bureau of Publications, Teachers College, Columbia University, New York.

2. Health Inventory for High School Students, grades 9-12. 1942. Two editions. Nontimed (about 60 minutes). Author: Gerwin Neher. California Test Bureau, Los Angeles, Calif.

3. Byrd Health Attitude Scale, grades 10-14. ...

4. Health and Safety Education Test ..., high school, first and second ... 1946-1947, 1945-1946. Forms ... Time: 40-45 minutes. Authors: Shelby Gallien and Hilda Schwehn. State High School Testing Service, Purdue University, Lafayette, Ind.

5. Health Education Test: Knowledge and Application, grades 7-16. ... Form A. Time: 40-45 minutes. Authors: Clifford L. Brownell, John H. Shaw, and Maurice Troyer. Acorn Publishing Company, Rockville Center, N.Y.

6. Health Practice Inventory, grades ... 1943. One form. Nontimed (15-29 minutes). Author: ... B. Johns. Stanford University Press, Stanford University, Calif.

7. Trusler-Arnett Health Knowledge Test, grades 9-16. Forms ... Time: 50-55 minutes. Authors: ... Trusler, C. E. Arnett, and ... Schrammel. Bureau of Educational Measurements, Kansas State Teachers College, Emporia, Kan.

8. Indiana Motor Fitness Index, boys and men, grades 10-16. 1943. 60 tests. Time: 50 minutes. Authors: Karl W. Bookwalter and Carolyn W. Bookwalter. Bureau of Cooperative Research and Field Service, School of Education, Indiana University, Bloomington, Ind.

9. Health Awareness Test, grades 4-8. 1937. One form. Time: 30-40 minutes. Authors: Raymond Franzen, Mayhew Derryberry, and William A. McCall. Bureau of Publications, Teachers College, Columbia University, New York.

10. Health Test, grades 3-8. 1937-1938. Two forms. Nontimed (about ... minutes). Authors: Robert K. Speer and Samuel Smith. Acorn Publishing Company, Rockville Center, N.Y.


TESTS OF INFORMATION IN PHYSICAL EDUCATION 


In recent years more attention has been given to tests of information in physical education. Playing regulations, game situations, and knowledge of positions and tactics have offered materials for constructing objective tests. Information tests for basketball, baseball, soccer, and tennis have been constructed. In most cases these tests have not reached the publication stage. They most usually appear in the research quarterlies of the National Physical Education Association.

RATING SCALES

Rating scales in physical education ... several areas. Attention ... other sports, for basketball, riding competition, and ...

SUMMARY

Teaching objectives in physical education have been clearly defined. A great variety of tests and ratings indicate clearly whether or not ... have been reached. These instruments have been divided into tests of physical capacity, tests of health, and tests of achievement. Tests of physical capacity measure those traits which have had little ... training. Pulse rate and blood pressure, lung capacity, ... and motor coordination are samples of tests of physical capacity. ... carefully constructed, usually standardized, ... and have satisfactory reliability. Achievement tests in physical education have been standardized for some 33 different activities. Not only have T-score norms been furnished for these numerous activities, but each activity has T-score norms at eight levels of physical capacity which depend on height, weight, ... norms, worked out in three volumes and based on scores ... subjects, are quite satisfactory.

The measurement of health information is divided into (1) health knowledge, and (2) health practices. Tests of health knowledge are constructed much as are other tests of information. Their items are based ... extensive curricular research to discover items common to all good courses of study. Because information about health practices depends so much upon the willingness of the subject to report what his practices are, objective standardized tests are difficult to construct in this area.

QUESTIONS AND EXERCISES

1. a. Under what conditions does improvement come in physical education with the greatest degree of certainty?
b. Do the objectives of instruction in physical education agree with the modern philosophy of education?

2. a. Distinguish between physical capacity and physical ability.
b. ... the principle involved in the cardiovascular tests. Illustrate with the Schneider Test.

3. a. Compare the Michigan Pulse Rate Test for Physical Fitness with the California Group Functional Test.
b. What are the chief characteristics of the Rogers Strength Index?

4. a. What is the best way to measure posture? Why is this procedure not used more widely?
b. ... the procedure used in ...

5. a. What are three stunts used in Brace's Scale of Motor Ability?
b. What modifications of the Brace scale were made in the Iowa revision?

6. a. Describe the process used to classify boys and girls so as to measure achievement.
b. What uses can be made of achievement tests?

7. a. Describe the leading characteristics of the Gates-Strang Health Knowledge Test; the Health Inventory for High School Students.
b. What characteristics of the latter recommend it for use?

8. a. Why have rating scales been so successful in certain areas (e.g., diving) of physical education?
b. Why are tests for sports so hard to construct?
BIBLIOGRAPHY

Books

Bovard, John F., Frederick W. Cozens, and E. Patricia Hagman: Tests and Measurements in Physical Education, 3d ed., pp. ... Philadelphia: W. B. Saunders Company, 1949.

Brace, D. K.: Measuring Motor Ability. New York: A. S. Barnes and Company, 1927.

Brownell, Clifford Lee: A Scale for Measuring the Anterior-Posterior Posture of Ninth Grade Boys. New York: Bureau of Publications, Teachers College, Columbia University, 1928.

Cozens, F. W., Hazel J. Cubberley, and N. P. Neilson: Achievement Scales in Physical Education Activities. New York: A. S. Barnes and Company, 1937.

Cozens, F. W., Martin H. Trieb, and N. P. Neilson: Physical Education Achievement Scales for Boys in Secondary Schools. New York: A. S. Barnes and Company, 1936.

Cureton, Thomas K.: Physical Fitness Appraisal and Guidance. St. Louis: The C. V. Mosby Company, Medical Publishers, 1947.

McCloy, Charles H.: Tests and Measurements in Health and Physical Education. New York: Appleton-Century-Crofts, Inc., 1942.

McCloy, Charles H.: Measurement of Athletic Power. New York: A. S. Barnes and Company, 1932.

Morehouse, Lawrence E., and Augustus T. Miller, Jr.: Physiology of Exercise. St. Louis: The C. V. Mosby Company, Medical Publishers, 1948.

National Collegiate Athletic Association: The Official Swimming Guide, "Official Rules for Swimming, Fancy Diving and Water Polo," pp. 164-189. New York: A. S. Barnes and Company, 1947.

Neilson, N. P., and Frederick W. Cozens: Achievement Scales in Physical Education ... New York: A. S. Barnes and Company, 1939.

Rogers, Frederick Rand: Physical Capacity Tests in the Administration of Physical Education. New York: Bureau of Publications, Teachers College, Columbia University, 1925.

Rogers, Frederick Rand: Physical Cap... : A. S. Barnes and ...

Scott, M. Gladys, and Esther French: Better Teaching through Testing. New York: A. S. Barnes and Company, 1945.

Articles

Bookwalter, K. W.: "A Critical Evaluation of Some of the Existing Means of Classifying Boys for Physical Education," Research Quarterly of the American Association for Health, Physical Education, and Recreation (1939) 10:119-127.

Cozens, Frederick W.: "Physical Education Measurement," Encyclopedia of Educational Research, pp. ... New York: The Macmillan Company, 1941.

Cureton, Thomas K., and Leonard Larson: "Strength as an Approach to Physical ...," Supplement to Research Quarterly of the American Association for Health, Physical Education, and Recreation (1941) 12: ...

Edgren, H. D.: "An Experiment in the Testing of Ability and Progress in Basketball," Research Quarterly of the American Physical Education Association (1932) 3:159-171.

Espenschade, Anna: "... of Motor Coordination in Boys and Girls," Research Quarterly of the American Association for Health, Physical Education, and Recreation ...

French, Esther: "The Construction of Knowledge Tests in ... Professional Courses in Physical Education," ... American Association for Health, Physical Education, and Recreation ...

Strang, Ruth: "Health Education," Encyclopedia of Educational Research, pp. 561-571. New York: The Macmillan Company, 1941.


PART TWO

Measurement of Intelligence

function measured. Sometimes these experimentings did hit upon tests which have later proved useful.

INTEREST IN THE FEEBLEMINDED

The second stream having to do with the measurement of intelligence flowed out of France. Consideration for the education of the deaf and blind and above all for the feebleminded originated in France. ... the education of this latter group under the leadership of Séguin ... had its greatest influence. Séguin had instructed a small class of the feebleminded at the Bicêtre and had shown that they had improved greatly. This work of Séguin stimulated Alfred Binet (1857-1911). He early became interested in the problem of intelligence testing and later of separating the feebleminded ... struggles to secure satisfactory ... closely in time the attempts of ... tests were first published in 1905, were revised in 1908 and again in 1911. Collaborating with Binet was ... were called the Binet-Simon tests. ... the Binet-Simon tests which influenced ... testing in the United States. The ... first from Dr. Henry Goddard, ... problems of the feebleminded at Vineland ... undergone some radical change in ... environment.

It was Terman who went to work on this problem with ... energy, intelligence, and enthusiasm that he was able to publish his Stanford Revision of the Binet-Simon tests in 1916. So successfully was this revision constructed that it became the leading individual test in the United States and remained in this position until 1937 when Terman and Merrill together published their own revision of the first Stanford Revision.

What are some of the characteristics of this Stanford Revision which ... it to become a leader? In the first place, Terman realized the weaknesses of the Goddard revision. Moreover, he found that the instructions for giving and scoring were sometimes not as clear as they might be, nor were tests always located at the right years. He studied and experimented with all the tests he could discover. By adding some ... and moving some up or down a year or two in age he finally ... them to fit pretty well a design which he had in mind. ... design the median mental age should correspond with the ... age. While he never quite achieved this goal the ... more nearly reached it than any other test published at that time. He increased the number of tests from the original 54 to ... the most useful new tests added by Terman was the vocabulary test. Instructions for giving and scoring were carefully set down. There were then six tests at each age from year III to year X. Above year ... there were eight tests at year XII, six at year XIV, six at average adult, and six at superior adult. There was introduced also the notion of the I.Q., which has proved to be a very practical device in spite of the recent many misgivings about its use. The credit for developing the notion of the I.Q. is usually given to Wilhelm Stern.

INDIVIDUAL TESTS OF INTELLIGENCE

MENTAL AGE SCALES

... Revision of the Stanford-Binet

The Stanford-Binet showed some weaknesses quite soon after it was published in 1916. Some items were so difficult to score that equally competent people disagreed on the outcome. Rudolph Pintner and staff at Teachers College, for example, worked out elaborate directions and illustrations for scoring this test. Lists of ... were made which would be acceptable for passing the ... test and a variety of drawings to aid in scoring the diamond and the ... It was, of course, clearly evident that the test did not ... test infants or high enough for the brightest, and ... tests at years 11 and 13. Then, too, a curious thing ... the I.Q.s of the very bright. These measures of intellectual ... to shrink as the child grew older. An I.Q. of 140 at 12 years would be ... at 15 years than 140. There was also only one form, which ... drawback in the rare cases where subjects had been coached or where, for some other reason, a test needed to be repeated. To be sure, Herring had constructed a test which he claimed ... an alternate form for the Stanford-Binet, but there was no truly parallel form in which the one form was made point for point like the other.

Finally, many psychologists and educators believed that the Stan...

The old principl 
from one year to t 


and reliability as Form L- tion of 
те exercised than in the ane us 
standardization. "The authors een jo? 
5 ^ m this test to represent the total poP es 
of the United States. To do this, they de pupils from H ay 50 
areas of our country. Not oP ocio 
Same proportions of the various * ose 


reater са, 


INTELLIGENCE AND ITS MEASUREMENT 357


occupations there are 30.6 
Ө 1 .6 per cent of employed men i 
ees ыс апа Merrill secured 31.4 nx cent of EA rcd em 
Шейле dus their sample. Never could they get enough children from 
certain groups, and so they had to allow for this weakness by standardizing the test with an average I.Q. of 102 at each age so that the representative I.Q. for the total population might be 100.

Whatever else a test has, it must have reliability. This 1937 revision has excellent reliability. If we say a test should have at least a reliability of .90, then this is better than that, for its reliability is represented by a coefficient of .93. Curiously enough, the higher I.Q.s pull reliability down. With feebleminded children the reliability coefficient is .98, while with very bright ones this coefficient is only .89.¹


The following samples are taken from the Terman-Merrill Revision of the Stanford-Binet. The test begins at year II and extends through the superior adult level. By inspecting samples at 3-year intervals, the rise in the level of difficulty can be more easily sensed.


Year VI
1. Defines five out of 10 or more such words as orange, straw, gown, roar.
2. Copies from memory a bead chain of seven beads which are square and round.
3. Discovers what parts of mutilated pictures are missing.
4. Can count 3, 9, 5, and 7 blocks correctly.
5. Can discriminate between drawings that are rather obviously different.
6. Can trace two of three rather simple maze patterns.


Year IX
1. Can draw lines to represent creases and a cut-out in a paper simply folded.
2. Can detect simple verbal absurdities.
3. Can draw Greek key pattern and truncated cone pattern from memory after having seen them for 10 seconds.
4. Can give rhymes such as the name of a color that rhymes with "head."
5. Makes change mentally when he is supposedly sent to a store with 10 cents to buy 4 cents worth of candy.
6. Repeats in reverse order four digits arranged in haphazard order.


¹ Terman, Lewis M., and Maud A. Merrill, Measuring Intelligence, p. 46. Boston: Houghton Mifflin Company, 1937. Items used by permission.
Year XII
1. Defines correctly 14 words out of a list of 45 arranged in increasing difficulty.
2. Detects verbal absurdities such as the one that asserts that in an old graveyard in Spain there was found a skull believed to be that of Christopher Columbus when he was 10 years old.
3. Explains what has happened in a picture in which a messenger boy who has broken his bicycle is hailing a passing motorist.
4. Repeats five digits reversed.
5. Defines abstract words such as constant and charity.
6. Completes sentences with words omitted.


When these sets of items are accurately located at the year levels they may be used as points of reference. Thus a child who passes the items of year VI is solving 6-year-old problems. A child's mental capacity may, then, be derived directly from the scores on a test. The number of years and months he scores may be thought of as his mental age.


Mental Age


By means of mental age, it is possible to compare a child of a given chronological age with the mental performance of the average child. As
а consequence, it is possible to sa 


е Terman-Merrill Revision. у mont? 
The first child has a chronological age of 14 years and 11 months (usually written 14-11). Here is his record:

Years               Months
VII (basal age)       84
VIII                   6
IX                     2
X                      4
XI                     2
Total                 98

M.A. = 8 years and 2 months (8-2)

\[ \text{I.Q.} = \frac{\text{M.A.}}{\text{C.A.}} \times 100 = 57 \]


Basal age is defined as that age on the test where all items are passed. Testing is frequently begun at a year under a child's chronological age. The tester usually then proceeds down the scale until all items at one age are passed and up the scale until all items are missed.

The second child has a chronological age of 8-6. His test record follows:

Years               Months
VIII (basal age)      96
IX                     8
X                      8
XI                     6
Total                118

M.A. = 9-10

\[ \text{I.Q.} = \frac{9\text{-}10}{8\text{-}6} \times 100 = 116 \]


Intelligence Quotient (I.Q.)

The intelligence quotient, ordinarily called the I.Q., expresses the relation between chronological age and mental age. As has been indicated in the two cases just described, it may be written

\[ \text{I.Q.} = \frac{\text{M.A.}}{\text{C.A.}} \times 100 \]

In the first child, whose M.A. is 8-2, this gives an I.Q. of 57. In the second it becomes

\[ \text{I.Q.} = \frac{9\text{-}10}{8\text{-}6} \times 100 = 116 \]
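The scoring arithmetic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, ages are converted to months, and the quotient is the plain ratio of M.A. to C.A. rounded to the nearest whole number. Only the second child's quotient is checked, since the denominator used for the first child is not legible in the source.

```python
def mental_age_months(basal_years, credits_above_basal):
    """Basal age in months plus the month credits earned at higher year levels."""
    return basal_years * 12 + sum(credits_above_basal)


def as_years_months(total_months):
    """Express a number of months in the 'years-months' form, e.g. (9, 10) for 9-10."""
    return divmod(total_months, 12)


def ratio_iq(ma_months, ca_months):
    """Plain ratio I.Q.: (M.A. / C.A.) x 100, rounded (an assumed rounding rule)."""
    return round(100 * ma_months / ca_months)


# First child: basal age VII, credits of 6, 2, 4, and 2 months above it.
first_ma = mental_age_months(7, [6, 2, 4, 2])
print(first_ma, as_years_months(first_ma))      # 98 (8, 2)

# Second child: basal age VIII, credits of 8, 8, and 6 months; C.A. is 8-6.
second_ma = mental_age_months(8, [8, 8, 6])
print(second_ma, as_years_months(second_ma))    # 118 (9, 10)
print(ratio_iq(second_ma, 8 * 12 + 6))          # 116
```

Working in months avoids treating a notation such as 9-10 as a decimal number.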


The intelligence quotient indicates both the intelligence which an individual possesses and his rate of growth. Let us consider the accompanying table, which is of aid in interpreting all I.Q.s. It is derived from the studies of the Terman-Merrill Revision and is recommended by Dr. Merrill.¹

I.Q.        Classification
140-169     Very superior
120-139     Superior
110-119     High average
90-109      Normal or average
80-89       Low average
70-79       Borderline defective

¹ Merrill, Maud A., "I.Q.s on the Revised Stanford-Binet Scale," Journal of Educational Psychology (1938) 29:641-651.


From this table, the I.Q. of 57 places our first child in the category of mentally defective, while the second child's I.Q. of 116 places him in the high average group.

The I.Q. also indicates something about the rate of growth. As an illustration, consider three I.Q.s: 50, 100, and 150. The rate of growth of the child of 50 I.Q. is about half that of the normal child. It takes such a child 2 chronological years to grow 1 mental year. At 8 years of age he has grown only 4 mental years. The child with the I.Q. of 100 grows 1 mental year during 1 chronological year. When he is 8 years old his mental age is also 8. The third child, with the I.Q. of 150, is growing at an accelerated rate. By the time he is 4 his mental age is 6; when he arrives at 8 his mental age is 12. Moreover, these three children will continue to grow at somewhat the same rate as they have grown. The I.Q. then gives us some indication of the rate of growth to be expected.
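The growth-rate reading of the I.Q. amounts to solving the quotient for mental age. The short Python sketch below makes that explicit; the function name is hypothetical, and the sketch assumes the I.Q. stays constant over the ages considered, which the text offers only as an approximation.

```python
def expected_mental_age(iq, chronological_age):
    """Mental age implied by a constant I.Q.: M.A. = (I.Q. / 100) x C.A."""
    return iq / 100 * chronological_age


for iq in (50, 100, 150):
    ages = {ca: expected_mental_age(iq, ca) for ca in (4, 8)}
    print(f"I.Q. {iq}: mental age at C.A. 4 and 8 -> {ages}")
```

For an I.Q. of 50 this gives a mental age of 4 at age 8, and for an I.Q. of 150 it gives 6 at age 4 and 12 at age 8, the figures used in the paragraph above.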


Four characteristics of intelligence tests need to be kept in mind when attempting to understand them:


1. LQ.s are not inhe 
life, the results of i 


education differed 24 
constitution 
lation to d 


who earns an І.О. of 90 has more r 
same I, 


cnelié 
L.Q. points.! Each of the pair had the €: fimu- 


th the 


irth 


¹ Newman, Horatio H., Multiple Human Births. New York: Doubleday, Doran & Company, Inc., 1940.


2. Intelligence quotients remain within certain definable limits from year to year, but the limits are broad.

3. An I.Q. is more valuable the nearer in time it has been computed. An I.Q. computed for a 3-year-old is of very little value at age 6. The testing of very small children is fraught with many difficulties, e.g., negativism. After year 6 or 7, the I.Q. stays more nearly the same and varies less. For a sixth-grade teacher to have to depend on an I.Q. computed in the third grade is unfortunate indeed. If it has been computed when the child was in the fifth grade it is valuable.

4. Intelligence tests, except for performance tests, measure verbal intelligence. This means that a poor reader in the fourth grade will be penalized by giving him a group intelligence test. Poor reading then is frequently the cause of low scores on group intelligence tests.

If these matters are kept in mind, intelligence test scores and I.Q.s are among the most useful types of information which can be collected. They indicate the child's present learning capacity and help the teacher in deciding what procedures are best for his continuing development.


Evaluation of the Terman-Merrill Revision

What of this latest test constructed after the Binet style? Does it stand up above other tests? It does. Most workers believe it the best test of its kind ever constructed. It will undoubtedly be used more than any other individual test, and yet there are those who believe that real improvement will come from other directions. They say, for example, that the new Stanford Revision is probably the last of the mental age scales because its standardization is "laborious, rigid, and final."¹ Another criticism in the same direction is voiced by clinical psychologists who deal with individual cases frequently nervous in disposition. One worker² thinks that scoring by points is less cumbersome, that the Terman-Merrill Revision is inconvenient, and that the 45-word vocabulary test is dreadfully inadequate. She believes, furthermore, that to ask subjects to define words orally is an imposition, in that the best subjects will not be satisfied with anything short of dictionary definitions and so remain silent. Then, too, the procedure by which the subject is carried forward until he misses all the items at an age level is a bad feature. It is bad because the test usually ends with a half dozen failures, and to some nervous subjects this series of failures is sheer torture.

¹ Freeman, F. N., Mental Tests, rev. ed., p. 106. Boston: Houghton Mifflin Company, 1939.
² Kent, Grace H., The Nineteen Forty Mental Measurements Yearbook (Oscar K. Buros, ed.), Item 1420. Highland Park, N.J.: The Mental Measurements Yearbook, 1940.


On the other hand, it is the opinion of one large clinic which used this test on more than a thousand cases¹ that the new test is superior statistically in every way to the old test (1916 revision): it eliminates many objections of the old, and it tests the brighter children more adequately as they grow older. But it also has its weaknesses. The newer test takes 25 to 30 per cent more time to give than the Stanford-Binet. There is still too much emphasis on verbal material, especially at certain year levels. Many tests are misplaced for New York children, for example, and there is need in the case of clinical work for more flexibility of administration.

Finally, the critics mention a weakness in basal age. The basal age is the age at which all the tests are passed. In scoring there are added to the score of the basal age additional mental months scored in the ages above the basal age. In the new test there may frequently be two basal ages or even at times three. For example, a child who is tested at the age of 10 passes all the tests at year X, misses one at year XI, and gets all tests right at year XII; thus both year X and year XII are basal ages, and the M.A. will be different depending on which one is used.
One investigator found, when 67 freshmen and 86 senior medical students were tested with the new revision, that the average number of basal ages for the freshmen was 1.5 and for the medical students, 2; part of the freshmen had more than one basal age, and 56 per cent of the medical students likewise. The reason for this multiplicity of basal ages is that the mental-age growth from one year to the next is a small amount indeed at years XIV, XV, etc. Year XIII seemed to be
difficult for these c 


pt 

enters inextricably into the upper levels s yu e 
gence and to be able to think abstractly demands, in most cases ng n 
These authors undoubtedly would reply to the criticism concer”! прег 
small sample of words in the vocabulary test, that if works and, pular 
more, that this test is not intended to test ян individual's am ual 
but through his vocabulary to get an indication of his level of inte”. to 
development. The multipli 


¹ Krugman, M., "Some Impressions of the Revised Stanford-Binet Scale," The Nineteen Forty Mental Measurements Yearbook, op. cit., Item 1420.


concept of mental age after the year 15 or 16 is passed. It is a well-known fact that the increment of growth between mental years decreases after age 12 or 13. Mental growth, in brief, slows down. The answer to the question as to when it ceases entirely has varied from 14 to 25 years. The experience gained in the First World War indicated that the average age of mental maturity is 14. Terman in the original Stanford-Binet used 16. In the Terman-Merrill Revision the age of 15 is used. This means that if we used 16, the denominator of the intelligence quotient for any subject 16 years, 17 years, or 25 years old would always be 16. The concept of mental age, then, has only hypothetical meaning after age 15 or 16. (Wechsler,¹ for example, substitutes for the I.Q. an efficiency quotient. This author claims that an intelligence quotient must always refer a subject's score to the mean of his age.) This fact limits the effectiveness of the Terman-Merrill Revision for measuring intelligence after the age of 15 or 16.

A second weakness consists of large variations in the standard deviations at various chronological ages. The standard deviations on Form L, for example, vary from 12.5 at year 6, to 20.0 at year 12, and 20.6 at year 2½. This means that a child's I.Q. of 112.5 at year 6 would correspond to 120 at year 12. Terman and Merrill disregard these yearly differences and use a standard deviation of 16 at all ages. Variations in I.Q.s from year to year would thus be affected by the very manner in which the tests are constructed.²
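The correspondence between an I.Q. of 112.5 at year 6 and 120 at year 12 follows from holding the child's standing constant in standard-deviation units. A minimal Python sketch, assuming a mean of 100 at every age and using a hypothetical function name:

```python
def equivalent_iq(iq, sd_from, sd_to, mean=100.0):
    """Re-express an I.Q. at the same number of standard deviations from the mean.

    Assumes the mean I.Q. is 100 at both ages; only the spread differs.
    """
    z = (iq - mean) / sd_from      # standing in standard-deviation units
    return mean + z * sd_to


# Standard deviations quoted in the text for Form L: 12.5 at year 6, 20.0 at year 12.
print(equivalent_iq(112.5, sd_from=12.5, sd_to=20.0))   # 120.0
```

On this reading a child who stays exactly one standard deviation above the mean would still show a change of several I.Q. points between the two ages.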


POINT SCALES

The Wechsler-Bellevue Intelligence Scale

In 1939 there was published for the first time the Wechsler-Bellevue Intelligence Scale. This scale resembles in scoring the point scale of Yerkes et al. This individual scale is suitable for subjects who are 10 years of age and older.³ It is particularly well suited for testing adults. This scale offers a serious challenge to all other tests of adult intelligence. It claims to measure the major part of what is contained in this definition: "Intelligence is the aggregate or global capacity of the individual to act purposefully, to think rationally and to deal effectively with his environment."⁴ Its general form resembles very closely that of the group tests of intelligence. There are 10 test forms and one substitute test, a test of vocabulary.

¹ Wechsler, David, Measurement of Adult Intelligence, 3d ed., p. 46. Baltimore: The Williams & Wilkins Company, 1944.
² Terman and Merrill, op. cit.
³ A test for children has now been constructed which extends the testing down to five years.
⁴ Wechsler, David, Measurement of Adult Intelligence, p. 3. Baltimore: The Williams & Wilkins Company, 1944.




nt 
Parts: (1) picture arrangement, 


may 
from ordinary experience, Some idea of the value of each subtest 
had from Table 10. 


IN QUESTION 


(Ages 20-34, N = 355) 1 
1. Verbal 2. Performance "T P 
Тмогтафоп, — Picture SETEHIEEMIgU seit tr P 
Comprehension. өй Picture НЕЕ Аена a 
Digit span DE designs p ege one e 
Arithmetic Object PM esr eR M 
Similarities Digit Symbol. easi рыз ре 
(Vocabulary) bi 
The subtest «s closest relation with the rm й 
tion of the othe bject assembly the least Inte 
15 the high relations 


test 4° 
{ etween Vocabulary and the to 
whole, Terman valued vocab 


In spite of Wechsler’s 


Test, th 


in tb 

ET ж 

he use of the time facto ; 
These are arithmetic r 


Criticism of t on ti 
© Scores of five tests are dependent ii desig?" 


“asoning, picture arrangement, bloc 
and digit symbol, 


ht 
" гоиб j 
5 scored and the sum of the correct items is D 11) 


ple? ^ 
Раве 1 of the test blank (Таро yp 


е, ( 
YP under (1) verbal score; jc 
ore. Tables are furnished by w " 
eg ols 
hree divisions. is the BO b 
is test are reported in tofa 
he test is established firs 


The validity and reliability o 
Adult Intelligence. The validity of t 


TABLE 11. SCORE SHEET OF WECHSLER-BELLEVUE TEST FOR AN INDIVIDUAL
(Table of weighted scores and summary of verbal, performance, and total scores; the body of the table is not legible in this copy.)




correlations were computed 
“commitment” or “попсо 
achieved, the re 


Stanford-Binet I.Q.s and psychiatri 


sts’ recommendations....... +++ б 
Wechsler-Bellevue L.Qs and Psychiatrists’ recommendations... ...-- a 


The evidence as 


оез are 

В as CaSUreS much the same sort of thing 45 ^ рк 
Terman-Merrilt Revision апа correlates with roup tests 200 
other individual tests, $ 
The report on the reliability is not all that could be desired. The reliability is computed from the repetition of the same test at intervals of 1 month to 1 year. It is true that the reliability coefficient for both children and adults is adequate, but the number of cases is definitely inadequate. Only 32 children


! Manual, p. 134. 


The determination of norms was carefully done. This procedure consisted of testing 670 children up to the age of 16, with up to 100 at each age, and 1,081 adults. There were from 50 to 195 adults at each age group, with a hundred or more at each group from 17 to 40 and fewer than 100 after 40. The securing of samples truly representative of the total population was a problem. Noting that in general there is a significant correlation between the Wechsler-Bellevue and the level of educational advancement, comparisons were made with these levels as achieved by the population of the United States. There is some tendency for the Wechsler-Bellevue population to be better educated than the average. For example, a larger percentage of the Wechsler-Bellevue group were college graduates than the national average of 2.93; the corresponding figures for illiterates were 2.55 and 4.69. In the Wechsler-Bellevue group, 19.68 per cent are high school graduates or above, while in the population at large this percentage is 13.86. Moreover, 10 per cent more of the Wechsler-Bellevue group are elementary school graduates only, while 13 per cent more of the general population did some elementary school work but did not graduate. The population sample was composed of whites only, and therefore the test is not recommended for use in measuring subjects of other races.


Distinctive Features of the Wechsler-Bellevue Scale

1. The Wechsler-Bellevue scale abolishes the use of the mental age in computing the I.Q. It is held (a) that the M.A. is only a score, and (b) that its range is limited beyond a certain age (usually 15 or 16). The nature of the I.Q. is changed somewhat. In this test,

\[ \text{I.Q.} = \frac{\text{attained or actual score}}{\text{expected mean score for age}} \]

This is a ratio between an individual's achieved score and the mean of the age group to which the individual belongs. It gives an individual's standing within his own age group. For these reasons the I.Q. keeps the same meaning throughout life.

2. This is a point scale with scores converted into standard score units. This is not so distinctive as might at first appear. The Terman-Merrill authors also calculated the standard deviation of 16 to be used with their forms. An I.Q. of 116 in this latter test is 1 standard deviation above the mean.

3. It makes allowance for the gradual deterioration of intelligence with age. An illustration of this occurs when a score of 70 is considered.


A score of 70 on the full scale gives the following I.Q.s according to the age:

Age        I.Q.
20-24       80
25-29       83
30-34       86
35-39       89
40-44       91
45-49       93
50-54       95
55-59       97

4. The use of subtest scores makes it possible to know immediately in which phase of intelligence the individual is weak or strong and to construct a profile if desired.

5. It allows for the computation of the I.Q. based either on verbal tests, on performance tests, or on both together. For poorly educated adults the I.Q. based on performance tests is of very great value.

The evidence as a whole clearly indicates that the Wechsler-Bellevue is the best instrument available for testing adult intelligence.
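The age allowance described in the list above is in practice a table lookup: the same score is referred to a different norm for each age group. The Python sketch below encodes only the eight rows quoted in the text for a full-scale score of 70. The names are hypothetical, and nothing here reproduces the actual Wechsler-Bellevue norm tables.

```python
# (lowest age, highest age) -> I.Q. for a full-scale score of 70, as quoted in the text.
IQ_FOR_SCORE_70 = [
    ((20, 24), 80),
    ((25, 29), 83),
    ((30, 34), 86),
    ((35, 39), 89),
    ((40, 44), 91),
    ((45, 49), 93),
    ((50, 54), 95),
    ((55, 59), 97),
]


def iq_for_score_70(age):
    """Return the tabled I.Q. for a score of 70 at the given age in years."""
    for (low, high), iq in IQ_FOR_SCORE_70:
        if low <= age <= high:
            return iq
    raise ValueError(f"age {age} is outside the rows quoted in the text")


print(iq_for_score_70(22))   # 80
print(iq_for_score_70(57))   # 97
```

The unchanged score is worth 17 more I.Q. points at ages 55-59 than at ages 20-24, which is the allowance for decline with age that the text describes.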


PERFORMANCE TESTS


h f ds and E 
er heavil euis of wor! 
other verba] problems М р 


On the one hand we have a sentence with words omitted; on the other appear pictures with certain parts omitted. The other point of view regards the performance test as distinctly supplementary to the verbal test and as supplying a phase of planning and problem solving not encountered in the first instance.


the picture, a keen observation of what was present in a previous picture, a recognition of what is not now there, and finally the selection of the correct picture. In many of the performance tests there is need first of observation, and then of analysis and selection. In a simple form board the subject must perceive the size and shape of the opening and then select out of many that block which fits the opening exactly. Again, he must actually thread his way through a pencil maze whose imaginary walls cannot be crossed, but along whose imaginary road the subject must move his pencil to an imagined goal. Do such procedures involve the same sort of intelligence that is present in answering verbal problems? It is impossible to tell by introspective analysis. The only real way to solve our problem is through the aid of the coefficient of correlation.
Even when we use this coefficient we cannot be certain of the answer. The reason for this is that many correlations between performance tests and verbal tests have not reckoned on the C.A., which is related to both. If we compute the coefficient of correlation between the Pintner-Paterson series and Stanford-Binet we get a correlation in the neighborhood of .80. Based on this figure we could say that 64 per cent of the variance in the verbal test was associated with the variability of the performance test.¹ But we have failed to consider the fact that both tests are correlated highly with age. The r between C.A. and Stanford-Binet M.A. is close to .90; and between Pintner-Paterson and C.A. about .75. When the factor of age is "partialed out" (made constant) the true correlation between these two tests is reduced to .43 and their percentage of dependent variance is reduced to 18. If this line of argument holds, the students who believe the performance tests to be distinctly supplementary to the verbal tests are correct. Another bit of evidence fits into this pattern. When the new Terman-Merrill Revision was being constructed the authors were very anxious to do away with that continuing criticism that the Stanford-Binet depended entirely too much upon language facility. They tried out several performance tests with that consciously in mind, but to no avail. These authors could find few performance tests which at the middle and upper levels of intelligence satisfied the demands laid down for the construction of the test as a whole.
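The figures in this argument can be checked with the standard first-order partial correlation formula, which the text does not spell out. In the Python sketch below the function name is hypothetical; x stands for the performance series, y for the Stanford-Binet, and z for chronological age.

```python
from math import sqrt


def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation between x and y with z held constant (first-order partial r)."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


r_tests = 0.80          # Pintner-Paterson with Stanford-Binet
r_perf_age = 0.75       # Pintner-Paterson with C.A.
r_binet_age = 0.90      # Stanford-Binet M.A. with C.A.

r_partial = partial_correlation(r_tests, r_perf_age, r_binet_age)

print(round(r_partial, 2))                     # 0.43
print(round(100 * r_tests ** 2))               # 64 per cent before age is removed
print(round(100 * round(r_partial, 2) ** 2))   # 18 per cent, using the rounded .43
```

The 18 per cent in the text comes from squaring the rounded value .43; squaring the unrounded coefficient gives a figure slightly nearer 19 per cent, which does not alter the argument.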


The Pintner-Paterson Scale of Performance Tests 


These tests differ sharply from the Binet tests (1) in requiring actual manipulation of material to solve the problem, and (2) in not leaning too heavily upon the relations between words. Most of these tests demand observation, memory, and manipulation for their solution. The Pintner-Paterson scales consist of 15 different tests that are given separately and scored separately. Seven of these tests consist of some type of form board: Séguin, two-figure, five-figure, Casuist, triangle test, diagonal, and Healey Puzzle A (Fig. 29). These tests demand

¹ This r² gives the percentage of the variance of the dependent variable which is associated with the independent variable. This interpretation is "a more general result" than that which takes r as giving the percentage of elements in one variable common to the other. Garrett, Henry E., Statistics in Psychology and Education. New York: Longmans, Green & Co., Inc., p. 355.


f 


FIG. 29. Pintner-Paterson Performance Test, short scale. (By permission of C. H. Stoelting Company, Chicago.)
relation of each 
three tests—Ma 


re ё 
S to the whole problem. The om 


ip T я Picture 7. а 
репа more ips P fest, and Healey а 


У 5 est; 
€ manipulation of the parts. In the ship tea 


I ^ 
* ате placed in an irregular е ship mn 
prrectly, a complete picture of mani" 
classify, although t | situ um 
the same grasping of the tota techd 


had already been decided upon and arranged in a key at the top of the test. The cube imitation test consists of five 1-inch cubes. Four of these are placed on a table and the fifth one is used to tap the others in a definite pattern. The subject watches closely, then takes the cube and taps out the same pattern that the experimenter has shown. The patterns then become more complex. The last test, the adaptation board, consists of a board with four round holes. Three of these are 6.8 centimeters in diameter and the fourth, 7 centimeters. One block fits exactly into the large hole (a fact demonstrated to the child). The whole board is then placed in four different positions. The child puts the block each time into the hole which it fits exactly.

For more convenient testing these 15 tests have been reduced to 10 by omitting the triangle test, diagonal test, Healey Puzzle A, substitution test, and adaptation board.¹ By reducing the size of some of the form boards the whole test may be conveniently carried in a small case. The short scale is probably to be preferred to the long one, both from convenience and because of fewer form boards.

The age range of this test is from 4 to 15 years. There are no tests that are discriminative over this total range. Two or three form boards are of little value after the M.A. of 10, and the feature profile has no value until after 10.


Arthur's Point Scale of Performance Tests 

This arrangement of tests is composed of two forms. Form I is made up of eight tests from the just-mentioned Pintner-Paterson series, together with the Kohs Block Design Test and the Porteus Mazes. Form II is made up of Healey Picture Completion II, the Porteus Mazes, and the Kohs Block Design, together with tests selected from the Pintner-Paterson series. These tests were restandardized on the basis of records secured from 1,100 school children, ages 6 to 16.²

Goodenough "Drawing a Man" Scale

Here is another performance test which requires no apparatus and usually not more than 10 minutes of the subject's time. Its instructions are straightforward and simple: "Make a picture of a man. Make the very best picture that you can." The score bears no relation to artistic

¹ Hildreth, Gertrude, and Rudolph Pintner, Manual of Directions for Pintner-Paterson Performance Tests, Short Scale. Bureau of Publications, Teachers College, Columbia University, New York, 1937.
² Arthur, Grace, A Point Scale of Performance Tests. New York: Commonwealth Fund, Division of Publication.




5 
rears and wor 
points in all. The test covers the range from 3 to 13 years à 


e 
"tA hat thes 
best between the ages of 4 and 10. Tables are furnished so t The 


4,000 


h 5 н i 
se his clothing was more pant w 
oints were selected for scoring 


suc 
e 
orman? 
n 
uccessive ages, (2) a clear des. The po! 
e same age but in different school grades. 


THE MEANING OF INTELLIGENCE 


| ement 
he present time any general t the 
intelligence. Tt seems to the АШЫШ coit 
ained in one aspect of Binet $ a agi a 
(1) the ability to take and maintai 0 


e 
urpo” rpe 
У to make adaptations for the PUP?’ rj, 


o 0 

aning espoused by many psyc” p 

Some years ago (1921) à number of psychologists were t il 

express their individual Opinions as to what each thought 1? j 
was.! here were Over 20 re 


ition 45 е 
: these definitions, Colvin’s a f не 
15 simply апо{ ег Way of saying adapta Р 
purpose of attaining the de 


Я B ever 
one away with, since hardly 


ember, gave us the 1. 
genera] Capacity of ап indivi 


st 
adj" 
О new requirements,” “Iti 


"Hl 
е pil 
thinking t S à general mental ada?/4 

© 
1 "Intelligence and Its Meas 


urement ” 
Psychology (1921) 12:123-147, 1 


(Symposium), Journal 
95-216, р 


new problems and conditions of life." A few other definitions much like this one will be given. Woodworth says: "He has to see the point of the problem now set him, and to adapt what he has learned to the novel situation." Wells's definition approaches very closely these others: "Intelligence means precisely the property of so recombining our behavior patterns as to act better in novel situations." Of course there are degrees of adaptation. If an individual adapts well he has more intelligence than if he adapts poorly.

Another group of eminent psychologists places a slightly different emphasis upon what intelligence is, and yet the author believes all of them can be subsumed under one caption: degrees of adaptation. Thorndike, for example, defines intelligence as intellect, "as the power of good responses from the point of view of truth or fact." The emphasis here is upon the sagacity with which an individual adapts. He has less or more intelligence in proportion as he selects the responses poorly or well. Ballard's definition is similar to the one given above: "The relative general efficiency of minds measured under similar conditions of knowledge, interest, and habituation." General efficiency for what? For making adequate adaptations to new situations. Not greatly different is Pintner's definition: "We must remember that intelligence is merely an evaluation of the efficiency of a reaction or group of reactions under specific circumstances." But what are the bases of evaluation if they are not the adaptation to a situation? If the situation is well adapted to, we give a high value to it; if not, a low one. Finally, let us look at F. N. Freeman's definition: "Psychologically, degrees of intelligence seem to depend on the facility with which the subject matter of experience can be organized into new patterns." This rearrangement of thought material is what characterizes particularly the higher mental processes. The organization of subject matter of experience into new patterns is most certainly adaptation at a higher level. An individual meets a problem which is complex and involved. He brings to bear his past experience, adjusts and arranges it, selects from it those facts which help him meet the present problem, and in this manner solves his problem. In proportion as he adapts well, he is intelligent.
first reading does not fit into the concept Aoc » "This 


efi E 

Lass intelligence as the 
rr ae is probably mean 
at т. Ence. As a matter of fact 
Se 


the ae > 
Pick-and-shovel level invo use, at this level, an 


ative le : ore because; j 2 

st зе you i е < Te dis doctor's directions, or à builder can con- 

s carry о > 

is ЧФ а house in T lans furnished him. The reall Я 
rom the рга k abstractly- He can plan you & 


Non, 
€ of these. He can think 


y intelligent man 
house, in- 


74 MEASUREMENT OF INTELLIGENCE 
3 


bolism- 
tical sym rse 
i develop mathema the nu 
a preventive serum, or à doces T i: 
E Vise ч man. The reason that the аа one: b. 
pie i hey can meet si а 
Por igent, however, is because they а sch willed 
- е озу ihey can meet new situationsin a ed сеу {то m 
7 e i i ly. The scientist, › com) 
isfactorily they act intelligently. jv to & 4d 
т em the nurse in that he adapts анка Ф capacity 
em c If one wants to restrict intelligence dtes them 2 к 
он that attribute of many situations which veram te to inte It 
although different on the surface, and to use that tac to do We 
other situations, ż.e., to think abstractly, he is 2 intelligence 25 be 
seems a trifle restricted in conception to think of i than suc 
capacity to adapt by thinking abstractly. Surely this is 


edly 
B undoU elli- 
ful of all adaptations and those who can think v d d that int "i 
do possess a very high form of intelligence. It is subm 
gence in its very essen 


vere И 
ists ! 
се as used by competent Ca беш eT. 
great majority of cases is adaptation to meet a desired x the indivi pe 
It is clear that adaptation might refer to changes it there WO ate: 
tion remained static, As a resu 


x 


clim jle 
at or corn may be adapted to a us rund Mo 
On the other hand, the adaptation might be in the Y айв ut. 
the individual remains the same Neither of these wm which We 
exists. Generally speaking, there is a problem to be so dh a ар ge 
out of a situation or a field of forces. To solve such a pro lan 0 ch pas 
may be made of its component materials. However, the r^ way be Л 
must have been evolved by the individual so that in The succ? ол. 
adapted conditions around him to Solve the problem. fa ta of 
Solution of the problem would be evidence of his power 0 Я 
Intelligence then varies in i 


e рё 
Was 8, 12, 18, and 27 inches or ate 
h intelligence 15 shown by an officer’s succes 3 еб 
of a problem їп logistics which he has never met before. An р 
telligent such ап one is regard 
Procedure which 


INTELLIGENCE AND ITS MEASUREMENT 375 


Standardi ; є 

test. He also was the first to introduce the concept of mental age. The 1908 revision of the Binet-Simon scale was translated by Goddard, adapted to American children, and standardized upon a large number of them. Four revisions developed in the United States: (1) the Stanford Revision, (2) the Kuhlmann Revision, (3) A Point Scale for Measuring Mental Ability, and (4) the Herring Revision of the Binet Scales. Each of these scales has its merits. The Terman-Merrill Revision of the Stanford-Binet is the most recent of these and probably the most satisfactory of all for testing children.

The Wechsler-Bellevue Intelligence Test, first published in 1939, resembles in general form a group test of intelligence in that each subtest contains similar items and its scores are in points but continues the use of the I.Q. However, the I.Q. of this test is the expression of a relation between an individual's score and the average score of his age group.
Scores derived у influence of language оп the 
Bice testa T from the Binet tests construction of perform- 
Ein fn t was seen that the claims of these tests rested on the propo- 
Views m not all of intelligence 15 made up of verbal relations. Two 
verb 3 introduced: (1) that per e coordinate with 
" um tests, that they are another procedure 
e mental traits; (2) that performance tes 


апе] 
у ie adding a necessary and neglected part 
verbal tests. The Pintner-Paterson Scale of Performance Tests, Arthur's Point Scale of Performance Tests, and the Goodenough “Draw-a-Man” Scale were described.
g with the development with the instruments of measurement 
i understanding more adequately the very 
d by competent men 


] definitions formulate 
е discussion. The notion of adaptability has 
ich contains the elements of many 


c of intelligence. 


Ns AND EXER 
rize the leading changes in- 


d ees 

Ests Was the two types of inter- 5. Summa L 

ch led to the construction of the troduced by Terman in the Stanford 
Revision; by Terman and Merrill in 


fist tc 


5ts of intelli 
telligence. 
their 1937 revision. 


ide of a page the 


Seth. «X lai А 
catback SW the events which caused à 
Astruc: the early interest in test 6. Place on one side 
favorable facts concerning the Terman- 
i i on the other the 


tion in th а 
tei ho e United States. , С 
ts i first introduced the Binet Merrill Revision and о 
this з the United eye What was b . Which seem to you 
Aun Cas ogist's major interest to carry m 
thors z and evaluate the P 
ncept of intelligence- 


ading characteris- 


resent E ae 
rill Revision 


376 


with those of the Wechsler-Bellevue. 
8. How does the Wechsler-Bellevue 
test provide for the gradual decrease of 
intelligence with age? 
9. a. Distinguish between a per- 
formance test and a verbal test. 


BIBLIOGRAPHY 


Books 


ARTHUR, GRACE: A Point Scale of the Performance Tests, 2d ed. New York: Commonwealth Fund, Division of Publication, 1943.

FREEMAN, FRANK N.: Mental Tests, rev. ed. Boston: Houghton Mifflin Company, 1939.

GOODENOUGH, FLORENCE L., JOSEPHINE C. FOSTER, and M. J. VAN WAGENEN: The Minnesota Pre-school Tests. Minneapolis: Educational Test Bureau, 1932.

GOODENOUGH, FLORENCE L.: The Measurement of Intelligence by Drawings. Yonkers, N.Y.: World Book Company, 1926.

GOODENOUGH, FLORENCE L.: Mental Testing: Its History, Principles, and Applications. New York: Rinehart and Company, 1949.

HERRING, JOHN P.: Herring Revision of the Binet-Simon Tests, Examination Manual, Form A. Yonkers, N.Y.: World Book Company, 1931.

HILDRETH, GERTRUDE, and RUDOLPH PINTNER: Manual of Directions for Pintner-Paterson Short Scale

Yearbook, 1941.

KUHLMANN, F.: Tests of Mental Development. Minneapolis, Minn.: Educational Test Bureau, 1939.

PETERSON, JOSEPH: Early Conceptions and Tests of Intelligence. Yonkers, N.Y.: World Book Company, 1925.

PINTNER, RUDOLPH: Intelligence Testing, Methods and Results. New York: Henry Holt and Company, Inc., 1934.




Я es of 

b. Describe the main рео 

the Pintner-Paterson Scale o 

sis. int Scale 

mt. Why has the Arthur Boin d de 

of Performance Tests Бе ж 
most useful of the performa: 


т, q and 
се Tes a 


” 


House 


PonrEUs, S. D.: е 
Mental Differences. eee 
Smith Printing and Publis 
1933. M 

SrEAnMAN, CARL: The Sillan 
Man. New York: The Macr К 

ЖЬ ‘east! 
4k c5 RACHEL: Mental Men esi 
ment of Pre-school Children. 1931. | 
АУ: World Book Companys (og | 

Terman, Lewis M., Л "i 
Merritt: Measuring Inte Danis a 
ton, Houghton Mifilin Con Phe Me 


Abilities af 


р ы ork 

THORNDIKE, EDWARD eu f alk 
urement of е аа А 
Bureau of Publications, 


Tniversity, 1926-, ,. 0l 
lege, Columbia University; y en 

ThunsrONE, L. L.: uer 
Abilities, Psychometrika 2 
1938. 

WECHSLER, DAVID: Balt 
Adult Intelligence, 3d ed. aai 
Williams & Wilkins C oman 

Wetiman, Beru L.: Lhe 


0) 

il 
Measurcne phe 
È imore, 4. 


11150 


Т 
of Pre-school Children as Mea for 15 
Merrill-Palmer Scale of аге, V^ New 
Tests, Studies in Child We se idi 


00169). vay 
No. 3 (University of Iowa 2o 10 
Series, No. 361). Univers! 
1938. 


NÉ 
yr 
SBE gs” 
YERKES, ROBERT M. p or po 
Curtis Foster: А Pain et P. 
wing Mental Ability. Balti"! jog 
Wick and York Incorpora 


Articles 


BERNREUTER, ROBERT G., and CHARLES H. GOODMAN: “A Study of the Thurstone Primary Mental Abilities Tests Applied to Freshman Engineering Students,” Journal of Educational Psychology (1941) 32:55-60.

FORREST, RUTH: A Study of the Prognostic Value of the Merrill-Palmer Scale of Mental Tests and the Minnesota Pre-school Scale, unpublished master's thesis, University of Pittsburgh, 1939.

“Intelligence and Its Measurement” (symposium), Journal of Educational Psychology (1921) 12:123-147, 195-

MACMURRAY, DONALD: “A Comparison of the Intelligence of Gifted Children and of Dull-normal Children Measured by the Pintner-Paterson Scale, as against the Stanford-Binet Scale,” Journal of Psychology (1937) :273-280.

MERRILL, MAUD A.: “The Significance of I.Q.'s of the Revised Stanford-Binet Scales,” Journal of Educational Psychology (1938) 29:641-651.

MITCHELL, MILDRED B.: “The Revised Stanford-Binet for University Students,” Journal of Educational Research (1943) 36:507-511.

MITCHELL, MILDRED B.: “Irregularities of University Students on the Revised Stanford-Binet,” Journal of Educational Psychology (1941) 32:513-522.

SHARPE, S. E.: “Individual Psychology. A Study in Psychological Method,” American Journal of Psychology (1898-1899) 10:329-391.

WISSLER, CLARK: “The Correlation of Mental and Physical Tests,” Psychological Monographs (1901) Vol. 3, No. 6.


CHAPTER 15 


Group Tests of Intelligence 


THE DEVELOPMENT OF GROUP TESTS 
However succe: 
intelligence tests 


intelligence test which brought about this condition.
Group tests of intelligence were slow in being developed because they were opposed by psychologists. Some authors have said that it took a great war to develop and popularize group tests of intelligence. It is undeniable that


group tests may no 
It seemed to psychologist. 
individual test. Tn the first gativ? g^ 
certainly to thei se 0 p 
scattering of attention, or lack of self-confidence. In the case © tes 
tivism, the skillful test. 


-соайаёе in, 
improve his lack of self i t uP 
-You are doing fine, d for € NT. 
' are exhortations fre uently use гере, q's 

Р q ightly oF T^ р 
agement. Then, too, the directions could be modified slightly “ie chi 


1. 
Я this manner, а child s сї ее 
stantly motivated so that he did not attack one problem W ost d ith 
ness and alacrity ang another with gloom. One of the ™ Р 


ge 
ke the test, in which case there ^ 4 m 
to do but to try another time 


А oun j 
1 - Finally, many еуен : ot 
testing situation an unusual °PPortunity for observing 


reactions and i 
E work habits of th: j i 
ньне 1 e subject. Ratings w 
аво es Eae to cooperate, his M eb p 
usness, and his ability to keep hi | аа 
а Bete h y to keep his attention hi 
: ave been provided f bol. de By 
Informatio P ed for the purpose of collecti M ee 
A n about the i j есип 
беа personality adjustment to th i 1 onal 
| y all th i i ашынан 
тесш hese facts enter into the interpretation of оа 
How, th 
, then, can the gr : 
techni 1 group test compete at all with the indivi 
foren a Го, ea test weathered the ыга 
à orked. After the age of 6 or 7 the limi 1 RU 
Drevi à g or 7 the limitations of 
ined mentioned do not seem to affect the score a ved e "iR 
Generally speaking, elementary school, high school, and college students are willing to take the test. Certainty of understanding is secured by stating the problem, illustrating it, and then having the student try a fore-exercise himself. The skillful tester watches closely for any lapse of attention and when it occurs immediately steps quietly to the pupil and encourages him to work on the test or warns him that there is only a little time left. There is even a slight advantage residing in the group test when some self-conscious children are tested. Some of them become more self-conscious when a tester asks them oral questions, but when they are sitting in a class with others they lose themselves in the group and really make a better showing. However, some pupils refuse to cooperate and score very low or zero. Any child who scores very low or zero must be tested with another group test or with an individual test. One new difficulty presents itself in group testing:


Th : 
е final clincher in this argument came when the reliability and 
be satisfactory. 


RMY ВЕТА 
he exigencies of army 
scovered that many 
anted an 


Tur ARMY ALPHA AND А 


Th 
ied group test gre 
n the First World War the army offic 


Dsiy, 
il, io procedure of trying them out in situation 
me companies would be found ready to procee 
in their efficiency. It 


5 of the same regiment would be far behind in 
‘es and regiments who 


Jl-balanced comp 
n their mastery © 
ht young men for o 


8i 

Were omit to obtain we 

Just ag ‘arly at the same level i 
Important to select brig 




А test 
which а 
her types of training. These were some of the uses to E 
ot id bey ut. But what of the test itself? | — ui s 
M асе of psychologists charged with the eria so that А 
s i ing in difficulty—easy e ed 
t widely varying in difficu y de dem 
test wanted a tes A d difficult enough to c jy an 
А uld score something, and diffic edd 
“a ae eas They needed, too, an instrument w roa a and 
Le eri scored, would not take too long a time to 


a they 
x ven 
ent cheating when a test was erie "these 
thought that there should b iv . 


nfor 
n of Stat 
thoughts in mind they discovered that Arthur S. Otis, ра intelligenc à 
University, had begun the Construction of a group tes 


Ds = 
tely made available to the рс Ч А Otis 
were discovered and along wi 

at was known as a 
» Was finally revised, and reduced 


s st, 
"his te 
1 t test? 


Attention span 

Test 2, Arithmetic reasoning 
Test 3, Practical judgment 
Test 4, Same—opposite 


Test 5, Disarranged Sentences 
Test 6, Number 
Test 7. 
Test 8, 


ge 
t the 
"- r that y 
А$ group tests of intelligence are investigated it will be clea : се 
into their constr intellige of 


and 




diffe ; М " 
"mà : a list of paired numbers, discovering the parts of pictures 
Smaller pais А "P E a pencil dividing up larger areas into which 
The score on the total test is a summation of the number of items right on each test, with one exception. In Test 5 the score is one-third of the number right. As the test was used more and more it was discovered that instructions demonstrated on the blackboard and explained in gesture and pantomime were open to considerable variation in giving. The test did not prove as reliable or as valid as Alpha. There is a modern revision which uses only oral directions. The test was a forerunner of those tests of intelligence for little children which do not require a mastery of the written language. The Beta test, just as was the case with the performance tests, threw new light on the intelligence of those who have language handicaps.


THE PINTNER GENERAL ABILITY TESTS
A capital illustration of the group test is the series of tests called the Pintner General Ability Tests: Verbal Series. There are four different tests in this series: (1) the Pintner-Cunningham Primary Test, to be used from kindergarten through the first half of grade 2, (2) the Pintner-Durost Elementary Test, to be used from the last half of grade 2 through the first half of grade 4, (3) the Pintner Intermediate Test, suitable for last half of grade 4 and grades 5 through 8, and finally (4) Pintner Advanced, which is much like Pintner Intermediate Test but is more advanced and suitable for use with grade 9 through adult levels. The procedures used for the construction and standardization of this series of tests may be taken as examples of the available group tests of intelligence. Figure 30 shows a page from the Pintner-Cunningham Primary Test.


Throughout this series great effort has been exercised to select only those items and subtests that had already proved their worth as efficient indicators of intelligence. In the tests for the little children who had not yet learned to read or who as yet read rather haltingly, dependence was placed on pictures. For example, in the Pintner-Cunningham Primary Test the children were asked to mark the pictures of objects that were alike in some way, or to mark what goes up in the air, or again to mark the prettiest of three pictured faces. Figure 30 shows one of the pictures. Pintner-Durost also uses pictures in the Picture Content scale, by means of which ideas can be registered. In the intermediate and advanced tests eight subtests are used which experience had shown were probably the best of all. These are vocabulary, logical selection, number sequence, best answer, classification, opposites, analogies, and arithmetic reasoning.

The validity of each test has been carefully studied and the results published for inspection. Correlations have been computed with the Stanford-Binet in the case of each of the four tests. In the case of every test the correlation is about .80. In addition each test is correlated with many other evidences of intellectual progress. The Pintner-Cunningham test thus correlates well with measures of ability to succeed in school and especially well with tests of reading. The intermediate and advanced


FIG. 30. Pintner-Cunningham Primary Test, Test 1. “Mark the things that go up in the air.” (By permission of World Book Company.)
tests have coe 


05 
u 

The reliability based on scores of pupils from one grade varies from .83 to .89 but goes up to .94 when scores are drawn from members of the kindergarten, the first grade, and the second grade, a much wider range. Computations based on one grade or one age conform to the strictest canons of statistical accuracy.


Reliability correlations, based on pupils of one age or on one grade, are as follows:

Pintner-Cunningham ................ .89
Pintner-Durost
   I. Picture Content ............. .85
  II. Reading Content ............. .95
Pintner, Intermediate .............
Pintner, Advanced ................. .85


The standardization of these tests was adequate. The populations from whom the norms were established were studied for their representativeness and normality. One of the cities whose children's scores were used in the establishment of norms was shown to be an average American city. In the standardization of the intermediate test 100,000 children representing both urban and rural populations were used.

There are several important features of this series of group intelligence tests. In the first place, reference has already been made to the use of standard scores. On page 364 it was shown that the Wechsler-Bellevue uses this type of score in computing the I.Q. In like manner standard scores are used with these tests to compute the I.Q.


TABLE 12. USE OF STANDARD SCORES IN COMPUTING I.Q.'S
(Standard score norms corresponding to each age value. Intermediate and advanced tests, Forms A and B, regular edition.)
Years 


15 | 16 | 17 | 18 | 19 | 20 


eS Ы 


Months 
| в [ie lod dis] 12.) 48 үш 
ке || а 7|191| 195 
о  [101| 113| 124| 134| 143] 150] 158 i n U js t 192 | 196 
т |102] 114 | 125| 135] 143 | 150 158 * 183| 188] 192] 196 


8 
2 — 103| 115] 126| 136] 144 | 15! 159 | 165] 172/17 
178 | 183 188 | 192 196 


5 | 104] 116] 127 136) fae = 166 174 | 179 | 184 188 193| 197 
105 | 117 | 127 | 137 | 145 n 161 | 167 | 173 179 | 184 | 189 193 197 


~ 106| 118| 128 | 138 | 146 
161 | 167 | 174 | 180 184| 189 | 193 | 197 
168 | 174 | 180 | 185 | 190 194 | 198 


146 | 154 
175 |180 | 185 190 | 194 | 198 


6 

107 | 119 | 129 | 138 

7 5| 162 
10 39 | 147 | 15 

8| 120] 130] 1 147) rss | 162 160 


8 — |109/121| 131|140 
zs | 181 | 186 | 190 | 194 | 198 


63 | 169 
9 ао 122] 131 | 141 | 148 | 156 i 170| 176| 181 | 186| 191 | 195| 199 
191 | 195| 199 


10 157 
111] 123] 132| 142] 149 181 | 186 

1 112| 123| 133 | 142 | 149 157 | 164| 170 176 bd 

From Pintner's manual for administering and scoring the intermediate and advanced test.




n? 
170. 
6 months 
core 
his 
score is more than t 
between what is no 
algebraically to 10 
tained score — nor 


“pred ВУ in 
S 12-6; his mental age (M.A.), палей m in ү 
ndard score for 170, is 14-10. His Т note is 
tefore, is 14-10/12-6, or 119. You wi T 


r 1 HT. al WAY: е 
15 3 points larger when derived in the usu e 
is the exact procedure fo 


"T 

чував ther е 

: T years 11 and 12, For the other pe and mad 

slight modifications in scoring which are already worked ov 

available in a table. sect 
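The arithmetic of the two procedures may be set side by side in a short sketch. It is offered only as an illustration, not as Pintner's own scoring routine. The function names are invented for the purpose, and the norm of 154 for age 12-6 is an assumed reading of Table 12 (the printed table is damaged); it is at least consistent with the 3-point difference reported in the text.

```python
def age_in_months(age: str) -> int:
    """Convert an age written as 'years-months' (for example '12-6') to months."""
    years, months = age.split("-")
    return 12 * int(years) + int(months)


def ratio_iq(mental_age: str, chronological_age: str) -> int:
    """Usual quotient: 100 * M.A. / C.A., rounded to the nearest whole number."""
    return round(100 * age_in_months(mental_age) / age_in_months(chronological_age))


def standard_score_iq(obtained_score: int, norm_for_age: int) -> int:
    """Standard-score method: add (obtained score - norm) algebraically to 100."""
    return 100 + (obtained_score - norm_for_age)


usual = ratio_iq("14-10", "12-6")                # 178 months / 150 months
by_standard_score = standard_score_iq(170, 154)  # 154 is an ASSUMED norm for age 12-6
print(usual, by_standard_score, usual - by_standard_score)
```

Under these assumptions the quotient method gives 119 and the standard-score method 116, which reproduces the 3-point difference noted above.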
re аш of the Pintner Verbal Series are: p 

. Int 
: takes MOT? еї 
ight on through, unless he 


ast twO 
ave not finished test one, go on to test У? 
2. A profile may be 
i -toa 
эу eight tests. This profile enables the вереште w 
Score int t divisions a immediately 
Strength and Weakness, nd to see im 


кы! Series in its selection of ums its st? stand 
5 its precision jn i made E 
ization based on a ih marek: 


e 
Е пау of 
is as 


e are 


Конімамқ 

Another example of 
gence Tests. This well- 
the present (1951) has 


:STS 
-ANDERSON INTELLIGENCE TEST 
a test s 
known 


had fiy 


jli- 

Int€ t 

š ётѕор * gd” 

eries is the Kuhlmann-And: 1921 pies 
Set of tests appeared first 3 


: с are n 
© revisions, Altogether ther А р. эй 
Ж ; ing the 7 ght ¢ 
1Tt is interesting to compare this Tes . using “оце 0 

$ ult edure ; I. 
gested by Terman and Merrill in 1 QA MALA p.42 acer an S go 
Mifflin Company, 1937). The Standard score there used is derived m^ he Те 
16, just the same as that used here for the median standard score- к ver. 
Merrill procedure a person whose LQ. is 116 is just one S.D. above 


which were selected from 100 after preliminary trials. These tests are arranged into nine batteries with ten tests in each battery. There are two first-grade batteries, one for the first semester of the first grade and the other for the second. Batteries are arranged for each of the grades from grade 2 through grade 6; one for grades 7 and 8; and finally one extending from grades 9 to 12. Each battery is made by including a few of the tests found more difficult at the preceding level and adding suitable new tests. In this manner the 39 tests are distributed into nine batteries.


The standardization of the tests has been carefully done. More than 30,000 Minnesota children, representative of the general population, were used to ascertain and check the median mental-age scores. Moreover, the original norms were based on at least 350 nonselected children at each age. One of the unique features of this series is that each test, made up of 6 to 24 items, is standardized separately. The M.A., then, is taken as the median M.A. of the 10 which the subject secures in each battery. This arrangement whereby each test is standardized separately has elements of strength. In the first place, a new test can be added with very little difficulty. One more M.A. may simply be put into the total pool and the median computed. Moreover, since each subject earns 10 different M.A.s one may compute an average or standard deviation from their medians. In this manner the variabilities of different subjects may be compared. Or again, the profile of the individual's scores received from the 10 tests may be used to discover whether the level of the test corresponds with the mental level of the


thmetic reaso 
d low in 


Validity 

The authors' procedure to secure validity was certainly unique. Customarily validity is secured by comparing a test's score with some other measurement of the same mental processes secured independently. The degree of relation is indicated by the amount of correlation which obtains between the two independent measures. In this case the validity would be indicated by the size of the coefficients computed between the Kuhlmann-Anderson test and (1) Stanford-Binet scores, (2) school marks, and (3) other group intelligence tests. But these authors objected to each of these. They argued that (1) the Stanford-Binet test is an individual test and yields a score which is hard to compare with the group-test scores; (2) school marks are a mixture of intelligence, interest, and teachers' whims, so that the coefficients would be ambiguous to say the least; (3) other group tests used these just-criticized techniques for establishing validity and hence cannot be depended upon. These authors prefer to base their validity on the discriminative capacity of the test, which they define as “the ability to make

here 
crements of me 


"ma 
Reliability

Kuhlmann and Anderson were also opposed to computing reliability in the usual way. Ordinarily the degree of reliability is shown by the coefficient of correlation between successive givings of two forms of the test, or by the even-odd technique whereby the odd scores are correlated with the evens and then an estimate is made as to what the r would have been had the test been twice as long (see page 29). But Kuhlmann and Anderson would have none of these. The


due als? 
ir 5 Н те fa 
Variations in scores which we : Г 


t 
i ел} 
ect апа not in the test would lead, they E est WP віл 
Interpretation for the di erences would appear to be in t 
Was really in the subject. Т i 


uhlmann-An 
СУ argue, the eff. 
test to another i; at а minimum, B 
has not been determine. There is 
test as reliable as Possible und 


among the subjects and a 
there were variations in Scores f. CET 
а reduction in Correlation, we could know the part which 
the subject beyond the normal Played. 


1 Instruction manual, p. 8. Educational Test Bureau, Minneapolis, Minn.




The difficulty wi i 
y with these simpler i i 
E validity arises in the fact wn ambae y deem Aem ко 
© know how reliable a test i it i ‘able. We Уй 
p а test is, not whether it is reliable. We К 
реше we begin doing any calculation. If the reliability t Ше 
in 65 D from a representative age group is .85 and that of het de 
a. ca e second may definitely be used for individual diagnosis while 
the first may not be so used. Furthermore, if a group test correlates .70 with Stanford-Binet and .55 with school marks, it is definitely to be preferred to one whose correlation is .50 with Stanford-Binet and .40 with school marks.

Another weakness in standardization appears. There is only one form. Two forms of a test are of real use when (1) it is suspected that the test has been spoiled, or (2) it is wished to prevent cheating by staggering the tests, or (3) it is desired to have an unusually reliable score by combining those of the two forms.

Despite all these shortcomings Kuhlmann-Anderson tests have been widely and satisfactorily used. One competent user of tests, for example, states that he “has used the tests with entire satisfaction and considers them the most outstanding group scale available for use in the public schools.”1 One great advantage of this scale is that it does not reflect verbal ability as much as some other group tests. As evidence, one may mention that four of the 10 tests used for grade 5 are not dependent either upon reading or upon other verbal relations.


PRIMARY MENTAL ABILITIES

In the development of explanations of intelligence Spearman showed that if the tetrad equations came out zero there were two components of intelligence, factors g and s. As studies continued in this area of intelligence it became clearer that in many cases of correlations among several tests the conditions of the two-factor theory were not satisfied. Other factors appeared whose clusters of tests correlated more closely with each other than with factor g. Spearman then introduced four or five group factors in addition to his factors g and s. These were thought of as supplementary.
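The tetrad criterion lends itself to a brief numerical illustration. For any four tests a, b, c, d a tetrad difference has the form r_ab r_cd - r_ac r_bd; when a single general factor is the only thing the tests share, every such difference is zero. The sketch below is an illustration under invented figures: the g loadings, the added group factor of .15, and the function names are all assumptions, not data from Spearman.

```python
from itertools import combinations


def correlations_from_loadings(g):
    """r[i][j] = g[i] * g[j]: what is expected when g is the only shared factor."""
    n = len(g)
    return [[1.0 if i == j else g[i] * g[j] for j in range(n)] for i in range(n)]


def tetrad_differences(r):
    """All three tetrad differences for every set of four tests."""
    diffs = []
    for a, b, c, d in combinations(range(len(r)), 4):
        diffs.append(r[a][b] * r[c][d] - r[a][c] * r[b][d])
        diffs.append(r[a][b] * r[c][d] - r[a][d] * r[b][c])
        diffs.append(r[a][c] * r[b][d] - r[a][d] * r[b][c])
    return diffs


# Hypothetical g loadings for five tests.
g_only = correlations_from_loadings([0.9, 0.8, 0.7, 0.6, 0.5])
largest_g_only = max(abs(t) for t in tetrad_differences(g_only))

# Let tests 0 and 1 share something besides g (an assumed group factor).
with_group = [row[:] for row in g_only]
with_group[0][1] = with_group[1][0] = g_only[0][1] + 0.15
largest_with_group = max(abs(t) for t in tetrad_differences(with_group))

print(largest_g_only, largest_with_group)
```

In the first table the differences vanish except for rounding; in the second, the cluster formed by tests 0 and 1 leaves differences that are plainly not zero, which is the situation that led to the introduction of group factors.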

Eng] € movement for factor analysis (led by such mi 
and, and Thurstone) approached the matter in 

‹ ted for by а sin 


UR 
Be intelligence could not be accoun 
‚Т & but needed several coor ors to accoun 
i ts. Among these American 

he theory and mathematics 


port of Thelma Gwinn 
(Oscar K. Buros, 


en as Thompson in 
a different way. To 
gle dominant fac- 
t for all the rela- 


tal Meast 


tgers University Press. 


m 
е), КОШУ , Austin H., The 1938 Men 
+104. New Brunswick, N-J-: Ru 




х re theo 

;hich tap these factors which wel cence is 

erii ae nice To the Se tut conss Е 

retically es wes which may be represented by an I.Q. te i con 

cae i ыл For each of these factors tests Ө will meas"? 

vene Ther is, for example, one test all of whose en tö memory 

Ee el з another containing only items mu d verbal test 

P. ill another made up of definitions of words, the t suitable 101. 
When these factors were presented in a practical test suitable for a certain range of testing it was discovered that they did correlate positively with each other. To be sure these correlations were not as high as in some other batteries, but they still were present. Here is a table of correlations from the Examiner's Manual of 1948.1

         V      P      Q      Mo
P       .60
Q       .67    .56
Mo      .4     .52    .54
S       .55    .61    .56    .46

V = Verbal meaning
P = Perception
Q = Quantitative
Mo = Motor
S = Space (“ability to visualize and to think about objects in two and three dimensions”2)
However, it must be said in fairness that the intercorrelations between the test factors decrease with age until at the college level the correlations are in the order of .30 and not in the order of .50 as in this case.

the угус 

Because there аге Several components of intelligence, 25 e 

Stones believe, the 
stigma of the low Т. 
may be shown him 


50814 
p. 7. Chicago: Science Research А 
? Tbid., p. 12. 


poy 

15 Scores are printed on the bane db б Јо? о 
sheet to aid him in Interpreting his own scores. Figure Johnny” pat 
Jones’s scores on five primary mental abilities. Note ^. Note t 
scores on V and P, both of which are related to san A н 
mental age may be computed by combining the camper manual fyg. 

1 Thurstone, Thelma Gwinn, and L. L. Thurstone, Examiner’s "os, 

SRA Primary Mental Abilities, 


It is also claimed that differences in scores on these various factors are a great help for guidance. For example, there is a high correlation between scores on verbal meaning and perception on the one hand and readiness to read on the other. In like manner, the quantitative score gives a good idea of a child's possibilities in arithmetic.


FIG. 31. Johnny Jones's primary mental abilities.

The reliabilities of the primary mental abilities when computed by the Spearman-Brown method, using 500 students in grade 10B, are:

Verbal meaning ............ .92
Space ..................... .96
Reasoning .................
Number ....................
Word fluency ..............
Some of the claims for this series of tests have been substantiated statistically, but they (PMA) have not had such wide use as many of the other tests that have been on the market longer. Here are the results of investigations not before published.1 Relationships of each of the five primary mental abilities (PMA) to marks on subjects studied in high school were computed. Of particular interest are the coefficients computed between V (verbal meaning) and marks in English. Consider

1 Moody, Caesar B., Analysis of SRA Primary Mental Abilities of High School Pupils, doctoral dissertation, University of North Carolina, 1951.


MEASUREMENT OF INTELLIGENCE 
390 a 
3 dai workin 

the test of verbal meaning takes only 4 gor ee ii 
the fact qus hieves coefficients between .50 and .72 bs more, this 
time and de faa civics, and United States history. ys mon duc 
D ыс» -76 with marks in general business and .66 v 
Sa 


s tial coeffi- 
-67), and typing (.65). Number Shows substantia tics. In 
cients with Unite Story, typing, and general хаашаа 
ТУ mental ability shows a closer relati 
€n the five are combined (Т). 


INTELLIGENCE TESTS FOR VARIOUS LEVELS 
KINDERGARTEN AND BEGINNING FIRST GRADE


Variation appears in ch 
nterested i 


run k n n th 1 in 
ilize 
hi the materials ш me 
акегѕ have do € their best to construct items in ^ 
anner as to ер attention on the 
ing of attention, a on u 


м гас” 
n dy effort. They have utilized і in 
described 9r to discover what was wrong or miss 
ore complicated q 
of objects to be co 
Probably not more 


: те 

rawin opied, and piu. А 

p No Written siens S озы Шу ы t 

an eginni be 18 

one sitting. The tester should see that each child gets his own test blank, that they all have the right place before beginning, that they do not simply copy from each other, and that any child's attention with a propensity to wander be brought back immediately to the problem at hand. More than at any other age good results depend upon the cleverness of the tester in manipulating the testing process so as to get the best effort possible from each subject.


At no other age level is there a greater need for an accurate appraisal of a child's intelligence than in the first grade. Such an appraisal enters directly into any decision to begin the more formal work of the school (reading, numbers, writing, etc.). On the contrary, the reliabilities of the tests suitable for such testing are lower than those of the upper grades.
Some good group tests for kindergarten and beginning first grade are (1) Pintner-Cunningham Primary Mental Test, revised, kindergarten to grade 2; (2) Kuhlmann-Anderson Intelligence Tests, grade 1, first semester; (3) Detroit Beginning First Grade Intelligence, revised 1935; (4) Goodenough Intelligence Test, kindergarten to grade 3; (5) California Test of Mental Maturity, kindergarten and grade 1, 1943 preprimary battery; (6) SRA Primary Mental Abilities, PMA, ages 5 to 7.


GRADES 1 THROUGH 3

The materials for tests of these school grades show a definite change from pictorial materials to the use of language and number. The situation in the first instance is oral, the answer being registered in a picture. In the second case written language is used both in the situation and in the response. Let us take analogies as an illustration (Pintner-Durost, Scale I-A). The situation is given orally: "robin: worm." The child then must find in pictures the same relation. There are four pictures: a cat, a dog, a cat at a piano, and a mouse. Robin: worm = cat: mouse. When the relation is a written verbal one, the problem is, "clock: time = thermometer: mercury, zero, temperature." The clock is to time as the thermometer is to temperature. In the opposites test a similar condition holds. For illustration the question is given orally, "Mark the picture that means the opposite of asleep." The answer is contained in three pictures: (1) a bed, (2) a child evidently asleep in bed, and (3) a child sitting up reading. In other tests of this series one form contains all answers in pictures, while in the other form the problem and the solution are written.

The forms found most satisfactory at this level are the ones that have been tried out on numerous occasions and have proved their worth. Opposites and analogies have already been mentioned. Arithmetic reasoning holds its own as a test form. Vocabulary tests, both oral and written, remain good. Among the younger children the completion of drawings or the recognition of a drawing among others closely similar have been used. Tests of classification deserve special mention. These tests demand that the subject see likeness between items apparently different and then mark out another item really different from the other four. Altogether in the four recent tests especially suitable at this level there are a number of common types of test forms which have been included


Mark the picture that means the opposite of straight.
Mark the picture that means the opposite of high.
Mark the picture that means the opposite of rough.
Mark the picture that means the opposite of push.


in the battery (1) because they show a sharp rise in percentage passing from one age to the next higher one or from one grade to the next higher, and (2) because they correlate substantially with the total test. On the average they require the perception of relations to pass them, though a few simply require keen observation and memory.


TEST 4. OPPOSITES

In each line mark the word that means just the opposite of the first word.
A. black: dark light white night
B. down: below high up

Fig. 33. Pintner-Durost Elementary Test, Test 4, opposites: reading content.
Some good tests for grades 1 through 3 are (1) the Pintner-Durost Elementary Test,¹ suitable for last half of grade 2, grade 3, and first half of grade 4 (Figs. 32 and 33); (2) the California Test of Mental Maturity, grades 1 to 3; (3) the Kuhlmann-Anderson Intelligence Tests (in separate booklets), grade 1 (second semester), grade 2, and grade 3; (4) the Otis Quick-Scoring Mental Ability Tests: Alpha Test, grades 1 to 4; and (5) SRA Primary Mental Abilities, PMA, ages 7 to 11. From the California Test one may secure language and nonlanguage M.A.s as well as an M.A. based on the total test. It attempts to divide total intelligence into (1) memory, (2) spatial relationships, (3) reasoning, and (4) vocabulary. Also a dichotomous classification is made into language and nonlanguage tests (Fig. 34).

1 Items by permission of World Book Company, Yonkers, N.Y.

Fig. 34. Form for representing each subject's score, California Test of Mental Maturity. [Diagnostic profile of the Non-Language Section, Primary Series, devised by Elizabeth T. Sullivan, Willis W. Clark, and Ernest W. Tiegs. Test factors are charted against mental age in months, 48 to 168; the summary of data gives M.A. ÷ C.A. = I.Q.]
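The summary line of the California profile form relates mental age and chronological age to the intelligence quotient. A minimal sketch of that arithmetic follows, on the customary assumption that the quotient is multiplied by 100 and that both ages are expressed in months; the function names `to_months` and `ratio_iq` and the ages used are hypothetical and serve only as illustration.

```python
def to_months(years, months=0):
    """Convert an age given in years and months to months."""
    return 12 * years + months


def ratio_iq(mental_age_months, chronological_age_months):
    """Ratio I.Q.: mental age divided by chronological age, times 100."""
    if chronological_age_months <= 0:
        raise ValueError("chronological age must be positive")
    return 100.0 * mental_age_months / chronological_age_months


if __name__ == "__main__":
    # Hypothetical pupil: M.A. of 8 years 0 months, C.A. of 6 years 8 months.
    print(ratio_iq(to_months(8), to_months(6, 8)))
```

The same function may be applied separately to the language M.A., the nonlanguage M.A., and the M.A. based on the total test.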


GRADES 4 THROUGH 8 


forms or subtests are made mor 
rials and by making all the cho} 
occurring most f 
lowed closely by 
metic reasoning. „їй 

The giving of opposites to words permits of almost infinite range of difficulty. Samples of opposites picked from tests suitable for these grades are:


Я er] 
2. get 3. keep 4. lost 5. lose [Pintn 
ia Word means the Opposite of humility? DS otisl 
“Joy 2. pride a dry 
Find both—tennis ну 4. funny 5. recklessness 


lesson nice reward 


son] 
[Kuhlmann-Ander?? 


T APP [california] 
341-4 1~1315_49 [Kuhlm ann-Anderso? 
In the other, the Sequence 


$—9—13—11 21 55. 


(a) 30 (уз (c) 27 — (d) 29 © ей 
tirio (a) 12 (0) 27 (gs (d) 18 ане 
Logical selection, classification, analogies, and arithmetic reasoning are also used. In logical selection the question usually is, "What do these things always have?" or, expressed in another way, "What are these things never without?"

1 In this chapter permission for the use of the Pintner items was received from the World Book Company, Yonkers, N.Y.; for that of the California items, from the California Test Bureau, Los Angeles, Calif.; for that of the Kuhlmann-Anderson items, from the Educational Test Bureau, Minneapolis, Minn.


River: (1) fishes (2) boats (3) banks (4) bridge (5) ferry [Pintner]
Squirrel: (1) nuts (2) fur (3) tail (4) cage (5) tree [two things, Kuhlmann-Anderson]

In classification the problem is to discover in what respects four of the items are alike and one is different and to cross out the one that is unlike the others.

(1) diamond (2) gold (3) ruby (4) iron (5) platinum [Pintner]
(1) general (2) ensign (3) major (4) colonel (5) captain [Kuhlmann-Anderson]

The test form of analogies has been with us from the time of the first studies in group testing and is still highly regarded. In analogies one discovers a relation between two items and then applies that discovered relationship to the solution of the problem.¹

Body: Food :: Engine: (1) wheels (2) motion (3) smoke (4) fire (5) fuel [Pintner]
A lamp is to a light as (?) is to a breeze: (1) a fan (2) bright (3) a sailboat (4) a window (5) blow [Otis]


The old war horse in test construction is arithmetic reasoning. It has withstood the criticisms of being a special ability or of being too much like school because it correlates highly with the total test and because it is passed by a larger percentage of subjects at each increasing age level.
numbers is 35, What is the other number? 
[Pintner] 


(d) 65 (e) 30. 
or the day. Pupils from your school won 60 
(04 Q3 (8 


ts did you lose? 
[California] 
[Kuhlm ann-Anderson] 


(а 
na) 185 (узо — (0256 
ре 114 meet, 20 events were listed f 
4) ы of the events. How many even 
hat i 
Ut is the number 14 of which is 54 of 18? 
It is apparent that the six tests most frequently used in the measurement of intelligence are for the most part expressed in terms of facts well known by the subject. Now and then a problem is missed because of a lack of informational data, but this is not the rule. High scores are secured by those who are able to perceive relationships among words, numbers, or ideas. The other tests which are now listed are for the most part designed to test the subjects' capacity to discover relations more or less clearly apparent.

1 Permission to use items from the Otis test came from the World Book Company, Yonkers, N.Y.


The third group of test forms is composed of vocabulary, best answer, logical reasoning, substitution, memory, and similarities. In the first Stanford Revision of the Binet-Simon tests, Terman placed great emphasis upon his vocabulary test. He thought it as good as any of the other test items. While not quite as high a position is accorded it today, it still is regarded as a useful test.


refuse—(1) object (2) accept (3) delay 


intner] 
(4) reject (8) value [Pintn 
ballet—(1) feast (2) banquet 


(3) carnival (4) ball (5) jane апей 


liforni2. 
2 question З subdue 4 disguise m In 
И чи Ipha. 
The best-answer test, too, appeared in the original Army Alpha. In this test the subject selects the best answer out of three or four possible answers.


dispute—1 disturb 


"Drop by drop the lake is drained" means:
(1) Every man wishes water for his own well.
(2) It is never too late to mend.
(3) Drowning men will catch at a straw.
(4) All's well that ends well.
(5) Many little strokes fell great oaks. [Pintner]

Either the sun moves around the earth or the earth moves around the sun. But the sun does not move around the earth. Therefore
(1) the earth moves around the moon
(2) the earth moves around the sun
(3) the sun is larger than the earth. [California]

Another test form that has age on its side is the substitution test. It became popular because it reflected easily and directly the results of learning. Since intelligence was in some quarters defined as the ability to learn, this test fitted directly into that definition. One

may have a key such as 


A test of memo 
pairs, then givin, memb 5 
one. One reads first: “wind. our, sleep—bed, river— m. 
E , icture 
i { Wind" and asked to find one ror ead 
four pictures which completes the Pair. Another procedure 15 ter 2° 
ted а story, then 15 or 20 minutes la 
questions about the story, 




picks out 
o 
Pisce. Аи о: others the one which is similar to the first 
procedure may be carried out either with words mecs E 
ctures. 


Which 

of the five thi i 

laho: e Eve things below is most like these three: a tent, а flag, a sail 
2 aship 3 a staff 4 a towel 5а торе a 


( ) Ho 
use 
( )Cave ( ) Barn ( )Hotel  ( ) Store DWG 
[“ Mark three that are жал E 
4 on 


There are many other forms which have been successfully used: right and left, anagrams, mixed sentences, recognizing visual units in patterns, range of information, dividing visual figures, hard directions using alphabet, giving the genus of a named species, and several others. Of all these, only four will be described. In hard directions, the alphabet may appear at the top of the page and then a question such as "The first letter to the left of the 10th letter is ___?"

Anagrams have possibilities of great complications:
Ms С са 2 аш What is the word? 
54 Lg [Kuhlmann-Anderson] 
A range-of-information test was used as a member of the original Army Alpha:

Leghorn is a kind of: 1. rabbit 2. chicken 3. cow 4. horse [California]
ound in: 1. flowers 2. leaves 3. seeds 4. petals 
[California] 


: Toots 
Mixed sentences were used in the Stanford Revision of the Binet-Simon tests. It is a question of unscrambling sentences and then making a judgment about them.

children room of the out ran six ["Mark first and last word of corrected sentence."] [Kuhlmann-Anderson]
lost girl pencil the another bought

Suppose we consider all these successful test forms in the light of theory. One of the first theories set forth was the two-factor one. Spearman emphasized two factors in his explanation of intelligence, factor g and factor s. Factor g approaches very closely the usual term of general intelligence. Spearman then inquired into the nature of those tests that were heavily loaded with g. He found two principles of explanation: (1) eduction of relations, and (2) eduction of correlates. In the eduction of relations two items are set down and a relation is discovered between them, e.g., often, seldom (same or opposite?). In eduction of correlates one might give the word "often" and ask what is opposite, or analogies might be used, as "Sheep: mutton :: pig: (1) lamb (2) meat (3) pork (4) beef." When we consider our statistically




eral 
ali- 

2 
est, grades 5 to 8, ( i^n (3) 


&2 for 
Test 2 
ence Tests, Test 1 for grade 4, 

T grade 6, Test 4 o 


it Alpha 
s t: 
for grades 7 and 8; (4) Detrol 1 


HIGH SCHOOL: GRADES 9 THROUGH 12

The same test forms already mentioned for grades 4 to 8 are also used in the high school. The relations expressed are more subtle and harder to discern. Among the test forms found successful by nearly all test makers are analogies, arithmetic reasoning, opposites, vocabulary, and number sequences.


peace—happiness: . War— 


bellico ет] 
1 Sorrow 2 fright 3 death 4 pint? 
5 trouble br 
tree is to forest as Person ig to б women у couple 8 немее" 
| 9 crowd 10 men [reran Пий anl 
Japanese Japan Russian Dutch Serbia Spanish nn-Ander 
[Pick Cut both relations—Kuhlma | вооЁ 
1 Permission for items from Terman-McNemar Test from the World Book Company, Yonkers, N.Y.


Arithmetic reasoning is so well known that only one illustration will be taken:

If a boy can run at the rate of 6 feet in 1/4 of a second, how far can he run in 10 seconds? [Otis]

Opposites are again used both in words and in pictures:

obtuse: 1 accessible 2 abstruse 3 acute 4 corpulent 5 agile [Terman-McNemar]
attraction: 1 capillarity 2 consanguinity 3 gravitation 4 magnetism 5 repulsion [Pintner]

Vocabulary continues into the high school its usefulness as a test form:

diurnal: 1 weekly 2 yearly 3 nightly 4 daily 5 monthly [Pintner]
recumbent: 1 cumbersome 2 curved 3 reclining 4 saving [California]
curdle: 1 coagulate 2 spoil 3 snuggle 4 condense 5 churn [Terman-McNemar]

Number sequence is at the high school level, as well as in the previous grades, one of the most useful test forms:
(c) 1846 0024 (е) 12g 


14 
A oae 
Ж X g 

M Ms 3— 0 9 OH 

32 29 [Pintner] 

60. ss 21 22 17 12 [cross out wrong number—Kuhlmann-Anderson] 
51 49 40 37 [fill in gaps— California] 

Along with these generally accepted test forms are those others whose usefulness is unquestioned: best answer, logical selection, classification, disarranged sentences, hard directions, similarities, and memory. There is not a great deal of difference between best answers and logical selection.


« 


h 


Etter pi 
5 Sina à shilling than lend а half crown" means ^ 
z etter а penny than a copper. 
3. tive " Blve the wool than lend the whole sheep- 
Ы nee to the big. 
min ү i i "e: 
shill; g grows bigger with years- — 
in y 
8 will buy a crown. 
Notice the slight difference between the illustration above and those under logical selection. Here the problem is to discover what the thing always has:

A as 
Drig 
"ы tria m 3 glass 4 octagon 5 pentagon 
Co ngle 2 parallelogr [Pintner] 
7 friendship 8 adjustment 


respect 
ES [Terman-McNemar] 


Pn 
9 mis 
i: se à 
law always involves: 
10 violation 


Classification involves the crossing out of a word or picture which does not belong with the others:

1 trapezoid 2 cube 3 triangle 4 square 5 rectangle [Pintner]
6 large 7 tall 8 high 9 short 10 low [Terman-McNemar]

Disarranged sentences are also useful at these upper levels:

Mark first and last word in correctly arranged sentence: children room of the out ran six [Kuhlmann-Anderson]
period of a this close at the put sentence [California]

Hard directions were used in our first group test:

(Alphabet printed at top of page) Write the letter which follows the letter which comes next after C in the alphabet. [Otis]
Put ___ ___ the digits in the reverse order. 2 6 = 30 [Kuhlmann-Anderson]


Three words or pictures are displayed, and the subject must find a word or picture agreeing with this likeness:

large, red, good: heavy, size, color, apple, very [Otis]
(In pictures) hammer, anvil, nut to fit a bolt: electric light bulb, glass jar, water tap, and rolling pin. [California]


Safety—key. 
> Testing—aco 
er these pairs д, 
ect key from а 


ice; powe 
8raceful—swan; clear—ice; P ring? 


. — sp d 
г 5 
Th; base—triangle; ree inc 
Те read, the word safety ee ok y 
mong other pictures if his 


5 

It is clear that good tests of intelligence make use of materials known to the vast majority of students. In most cases the successful passing of the tests involves the perception of more or less subtle relations existing between materials already experienced.

Suitable tests for this level (grades 9 to 12) are (1) Pintner General Ability Tests, advanced test, grades 9 to 12; (2) Terman-McNemar Test of Mental Ability; (3) California Test of Mental Maturity, advanced series, grades 7 to 12; (4) Kuhlmann-Anderson Intelligence Tests, grades 9 to 12; (5) Otis Group Intelligence Scale, advanced examination, grades 7 to 12 (self-administering); (6) SRA Primary Mental Abilities, PMA, ages 11 to 17.


General Characteristics of Test Forms

From the consideration of the pages of description of tests in this chapter one can get a fair understanding of the types of test forms which test makers have found useful. In general, they all are passed by an increasing percentage of subjects with increasing age and all of them correlate with the total test. Relations are expressed in a variety of test forms and in several media. Visual forms and word forms are by far the most frequent media in which test items appear. In some tests, analogies or opposites, for example, may be given first with pictures (nonlanguage) and then with words (language). Pictures are widely used with students before they learn to read and with those who by some environmental condition are handicapped in their vocabulary and reading development. The same test forms appear at many levels of development. The increasing difficulty of these instruments is related to the greater subtlety of relationship between the facts or words, the increasing rareness of the words or other materials, and the increasing degree of similarity of the several answers from which one must be selected. With some exceptions, tests are dependent upon the eduction of relations and the eduction of correlates for the successful answers. It is important to observe that all tests use largely the same forms and the superiority of one test over another depends upon the careful checking of each item and the ingenuity in selecting items which challenge directly the mental functions desired or, more precisely, that correlate best with the desired criteria.
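The two checks named above, an increasing percentage passing with increasing age and a correlation of the item with the total test, can be sketched as follows. This is an illustration only and not the procedure of any test maker cited here; the function names and the six subjects are hypothetical, and the correlation used is the ordinary product-moment coefficient between a pass-fail item (scored 1 or 0) and the total score.

```python
from math import sqrt


def percent_passing_by_age(ages, passed):
    """Return {age: percentage of subjects of that age passing the item}."""
    totals, passes = {}, {}
    for age, ok in zip(ages, passed):
        totals[age] = totals.get(age, 0) + 1
        passes[age] = passes.get(age, 0) + (1 if ok else 0)
    return {age: 100.0 * passes[age] / totals[age] for age in sorted(totals)}


def item_total_correlation(passed, total_scores):
    """Product-moment correlation between item (1/0) and total score.

    Assumes the item is neither passed nor failed by every subject and
    that the total scores are not all equal.
    """
    x = [1.0 if ok else 0.0 for ok in passed]
    y = [float(t) for t in total_scores]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


if __name__ == "__main__":
    # Hypothetical data for six subjects, invented for illustration.
    ages = [6, 6, 7, 7, 8, 8]
    passed = [0, 0, 0, 1, 1, 1]
    totals = [10, 12, 15, 20, 22, 25]
    print(percent_passing_by_age(ages, passed))
    print(round(item_total_correlation(passed, totals), 2))
```

An item would be retained under these two criteria when the percentages rise from each age to the next and the correlation with the total score is substantial.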


USES OF INTELLIGENCE TESTS

At the very beginning of the child's entrance into the formal school work of the first grade, intelligence tests are of primary use. Whatever else these tests measure, they measure something that is related to the capacity to learn to read, write, and figure. The law of the land usually requires that a child enter school when he is 6 years of age. Note that the requirement is in terms of chronological years, not mental years. Children of the same chronological age differ greatly in mental age. For example, one student comments as follows upon McNemar's 2,106 subjects in grades 1 to 12 tested by the Terman-Merrill Revision:¹

1 Cook, Walter W., in Educational Measurement (E. F. Lindquist, ed.), pp. 9-10. Washington, D.C.: American Council on Education, 1951.






d in this 
One may conclude from these and other data presente 


indicated 
How great thi also been clearly indica 
study of 4,393 


* Not all of them were 6 years 
since they varied in age from 5-4 to 13- 


in the 
of age 


Der cent of those 
those whose M.A 
whose M.A s ar 
carefully selecteq for this f 
hazardous, he uses 1 
Webb апа Shotwell present interesting illustrations of - nor s 
tests with children of superior ability, with those slightly belo were mo 
and with the definitely feebleminded whose individual needs 8 


nd n n 
`5 ranged between 5.8 and t child? 
5 were below 5-82 Recent studies show t jals 
s mater nd 
5-0 may be taught to read if the | ly slow 4 
àge. But the going is most certainly 


а r 
етер he: 
ther cases are presented = апо!) jp 
ir children in ae succes? 
evels were clearly inadequate fo 
he first grade. yonk? 
1 Dickson, V. E., Mental Tests and the Classroom Teacher, pp. 96-9. Yonkers, N.Y.: World Book Company, 1923.
2 Davis, H., "Intelligence Tests in Public Schools in Jackson," Yearbook of the National Society for the Study of Education, Chap. III, p. 131. Bloomington, Ill.: Public School Publishing Company, 1922.
3 Webb, L. W., and Anna Markt Shotwell, Testing in the Elementary School, pp. 114-116. New York: Rinehart & Company, 1939.




Enough h а ы 
secured frm suc tests as have b - ped od pacis 
tari h tes e been previously describ 
айы ee Meg a zh са the оша io ced dme 
MERE cd qu nal instruction in reading, writing, etc One 
reading gh urse, that factors other than intelligence enter i 
their .diness. Intelligent parents who read to childr E 
edly eu emn take them walking and tell them аша 
та ито lear : ildren's vocabulary level and thus make them more 
Жер. о m But even here intelligence is related to the num- 
Кө» "e and the understandings acquired. 
In the second place, intelligence tests are helpful in guiding students into those courses where they have some likelihood of being successful. It is well known that certain school subjects are much more closely correlated with intelligence tests than are others. This means that those who score high on intelligence tests also do very well on these subjects; those who have medium scores on the intelligence tests get about average marks on these subjects; and finally the lower third have a great deal of difficulty with these subjects. In the elementary school, composition, reading for understanding, dictation, and arithmetic problems usually show correlations with intelligence tests of .5 to .6. In the high school, subjects such as mathematics, Latin, and English composition are highly dependent upon intelligence. Professor Thorndike, who gave especial attention to this problem, thought that the correlation between algebra and intelligence in the high school would ordinarily be in the neighborhood of .70,¹ although a relation between .45 and .50 would more nearly represent what is usually found.
ook for a moment at (1) the intelligence of those who elect 
and (2) the intelligence required of those 
15 of intelligence do those persons possess 
ording to one investiga- 


onometry? Асс П 
1 students electing solid 


{ high schoo ele j 
m the upper fourth in intelligence 
in, natural science, 


arioy 
Who 18 hi 
h nigh school subjects, 


Wh. Pas 

Ke e ect пе Courses. What leve 
wa no geometry and trig 
у etry than three-fourths 0 

& d less eM trigonometry come fro? 
ураш an 10 per cent from the lower fourth. Latin, natu x 
п the 2 2nd French also drew heavily from those with high intelligence. 
intelligence d ients for those boys 
i The highest I.Q.s were 


Dagar S 
Do g Tep place, the median ! | 
S n sch j a varied widely: 
ool subjects als? БЕЙ, followed. next by the 1.05 of 
3 


th Se 
186 Who by those who passe Б 
ba, borng: Passed ancient history and algebra. 
ny, jg dike, Е + Algebra. New 
ay 1923 ^ . L., The Psychology of Alge?re- 
a Factor in th 


York: The Macmillan Com- 


e Election of High School Sub- 


he Power 

tg » Ets, 

а è i Solon! Res “Intelligence 85 e 

Nog . s evlew .452— 4 . | | | 

Ce in теп, I. N., Aree re of Intelligence Tests to Educational Guid- 
1922) 30:692-701. 


ing; 
i 
eh School," School Review 


1 f education, the 
ous levels о; ducatio t | 
At ^ pen most exacting in LIE Ta 
ED and the foreign oo Yan dwriting correlate very schanich 
genes, dwork, drawing, qd 
handwork, E hool level manua : "sence- 
ntrary, - At the high schoo п ith intelligence. Ў 
елсе scores, arts have low correlations ас нта ой a 
)s of er 
Tos ot those who Pass them are measurably lower, d 
those electing them are below average ї 
subjects at the high schoo] level are 
who are Somewhat below 


t 
mos 
e 

tory is repeated. In wem 
oni moe tempus e ol, on 
In the elementary sc 


5: 

udents were divided into three group h 

group, 1.Q.’5 85 to 89, (2) + 
5 to 104, ang (3) the h improvemen 

14, then the higher the “s Бтоцр the greater the impr 

i les, Those о 

» and to dern forele 

Опе before taking Чр the study of mo 

showed losses in 


six 
Average сар 
1 Werner, Oscar H., "The Influence of the Study of Modern Foreign Languages on the Development of Desirable Abilities in English," Studies in Modern Language Teaching, pp. 99-145. Publications of the American and Canadian Committees on Modern Languages. New York: The Macmillan Company, 1930.
2 Jordan, A. M., Educational Psychology, 3d ed., pp. 292-293. New York: Henry Holt and Company, Inc., 1942. By permission.




… exception of tests of punctuation … It is they who really have the
mental capacity to see relations and contrasts between the two
languages which enable them to make large improvements in scores in
grammar, language usage, and reading. It might well be made a universal
rule that no student whose I.Q. is below … should be allowed to
register for a modern foreign language.


RESULTS OF EDUCATIONAL GUIDANCE 


As seen on page 404, the results of such careful consideration of
individual differences and the adaptation of courses to them produce a
measurable effect. For purposes of contrast we shall introduce the
elimination of students when there is no definite program of guidance
and contrast with these conditions the results when guidance has been
effectively done.


TABLE 13.* RELATIONSHIP OF SCHOOL SUCCESS TO BINET INTELLIGENCE QUOTIENT
(131 high school pupils tested in 1916 and 1917 and followed up for 6 or 7 years)

                                       Completed 4-year       Left high school
I.Q. level on            Number of    high school course        to go to work
Stanford-Binet scale       cases      Number    Per cent      Number    Per cent
Very superior                19         19        100            0          0
Superior                     27         26         96            1          4
Above average                24         20         83            4         17
Average                      36         27         75            9         25
Below average                22          9         40           13         60
…                             3          0          0            3        100
Total                       131        101         77           30         23

* Proctor, W. M., Educational and Vocational Guidance, Table II, p. 31, Riverside Textbooks in Education. Boston: Houghton Mifflin Company, 1925.

Table 13 offers evidence that the bright continue in school and that …
All of 19 students with an I.Q. of … or over finished high school.
Contrast this record with that of the students with I.Q.s of 85 and
below. Only 9 of these 25 were able to finish high school, or about
one-third of the total. A similar story is shown in Table 14; this time
the … concerned is … instead of the high school. … in 2 years the trend
is inescapable. The correlation between intelligence scores and length
of stay in college …




TABLE 14.* THE RELATIONSHIP BETWEEN SCORES ON OTIS TEST AND THE CONTINUATION OF STUDENTS IN COLLEGE
(Study covers a period of 2 years)

                                              Those remaining        Leaving before
I.Q. derived from          Total number          2 years                2 years
Otis Tests                 of students      Number    Per cent     Number    Per cent
115-124 (superior)             158            115        72           43        28
105-114 (above average)        247            154        62           93        38
95-104 (average)               103             60        58           43        42
85-94 (below average)           43             18        42           25        58
75-84 (dull)                    11              2        18            9        82
Total                          562            349        62          213        38

* Jordan, A. M., Educational Psychology, 3d ed., p. 520. New York: Henry Holt and Company, Inc., 1942.
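
The percentages in Table 14 follow directly from the counts. The sketch below is a minimal illustration, assuming the counts of students remaining and leaving as they can be read or inferred from the table; the names `TABLE_14` and `percent_remaining` are illustrative and not from the source, and whole-number rounding may differ by a point from the printed figures.

```python
# Counts taken from Table 14: I.Q. band -> (remaining 2 years, leaving before 2 years).
TABLE_14 = {
    "115-124 (superior)": (115, 43),
    "105-114 (above average)": (154, 93),
    "95-104 (average)": (60, 43),
    "85-94 (below average)": (18, 25),
    "75-84 (dull)": (2, 9),
}


def percent_remaining(remaining: int, leaving: int) -> int:
    """Per cent of a group still in college after 2 years, as a whole number."""
    return round(100 * remaining / (remaining + leaving))


total_remaining = sum(r for r, _ in TABLE_14.values())
total_leaving = sum(l for _, l in TABLE_14.values())

for band, (remaining, leaving) in TABLE_14.items():
    size = remaining + leaving
    print(f"{band:<26}{size:>5}{percent_remaining(remaining, leaving):>5}%")

print(f"{'Total':<26}{total_remaining + total_leaving:>5}"
      f"{percent_remaining(total_remaining, total_leaving):>5}%")
```

Reading down the output, the share remaining falls steadily from the highest I.Q. band to the lowest, which is the trend the table is meant to show.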


allow” 
ram of guidance. Another result of ? 


Satisfies neither group. 


pnt 
How much better for all conc 
which to choo. 


› e x then 
guidance is reflected in th 
€ number out at work. 


             Out at    Out by      Failed one    Failed two or
             work      transfer    subject       more subjects
Guided        …          …          18.2            0
Unguided     12.1       13.1        30.8           10.3
1 Proctor, W. M., Psychological Tests and Guidance of High School Pupils, Journal of Educational Research Monographs, No. 1. Bloomington, Ill.: Public School Publishing Company, 1923.




“Т H H 
The reduction of failures for one subject from 30.8 per cent to 18.2
per cent and for two subjects from 10.3 per cent to 0 per cent is
especially noteworthy. The bases of guidance in this study were far
broader than intelligence-test scores, but these latter undoubtedly
entered into decisions concerning the choice of subjects.

Intelligence tests are also used to help students decide on various
courses of study. Since the members of some courses have a much higher
average intelligence than others, this information should be conveyed to
the student who is soon to enter them. In one case,¹ the average I.Q. for
the … course was 114.5; commercial course, 109.4; technical course, …;
industrial arts course, 103.1; dressmaking course, 97.4. These I.Q.s
show very well the relations between the courses selected and the test
scores. A corresponding report from the city of Saint Louis gave pupils
in the scientific course an average I.Q. of 109.8; in the general
course, …; classical, 106; commercial, 103.2; manual training, 102.5;
art, …; and home economics, 100.5. These two studies make clear that
the scientific, language, and college-preparatory courses enroll on the
average more intelligent students than the other courses. This same
trend is noticeable at the college level. At Ohio State University and
at the University of Illinois the arts, commerce, and journalism drew
from groups better equipped in intelligence than veterinary medicine,
dentistry, and pharmacy.² The average scores on the intelligence tests
were also high in medicine, law, and engineering. These divisions of the
university were closely alike, varying only from 141 to 147. Veterinary
medicine with a score of 112, dentistry with 115, and pharmacy with 125
were lowest of all in their intellectual requirements. A student who
would rank in the lowest quarter as a student of law might be above the
average of his classmates in dentistry. He would then have to decide
whether to struggle along in law or to shine in dentistry.
USE OF INTELLIGENCE TESTS FOR HOMOGENEOUS GROUPING

For many years teachers have realized the difficulties inherent in
attempting to teach pupils or students who are widely different in their
abilities to learn. Those explanations and materials which were
suitable for the average of the class would bore the bright and confuse
the dull. If the teacher pitched her class discussions and materials on
the level of the bright, the … of the class would be doubly confounded.
To meet the situation homogeneous grouping of pupils or students …

1 …, "… Psychology Tests Given to Groups of Public …," … p. 116. … Society …
2 …, … Experimental Study of Education…, p. 297. New York: Henry Holt and Company, …



Thus the same median score might be obtained by one pupil who was
good in arithmetic reasoning and poor in language and by another whose
… greater homogeneity than would obtain in the … groups combined.
Humanitarians claim that to label a slow group by calling them "the Z
group" or the "opportunity class" is decidedly … These persons also
believe that this procedure of … conscious of their plight and hence …
develop in them a feeling of inferiority. Claims are also made that …
erior individuals, all of whom pe 
ool also should group them Hber low" 
grouping are activated by the fo ore 


gence stimulate each other much those 


; : genuit 

we 7 : Their progress depends upon the ing?" ; 

rr p aus nus а more eine df materia aa 
n 

Investigate, A summary Pr understandings of topics ee ned 


group are not much mega with more concrete details. The ™ ge- 
pends upon whether the te 
procedures to those more s 


AIDS IN MAKING DECISIONS ABOUT GOING TO COLLEGE

Intelligence tests furnish evidence bearing on the subsequent success
of a high school student in college. … to constitute the total
prediction picture, … directions. The coefficients of correlation have
been computed a thousand times or more between college marks and
intelligence-test scores. In the majority of cases they have been found
… Under … conditions one can confidently expect a … between test scores
and the average of school marks.¹ Coefficients as high as .60 reduce the
error of estimate by 20 per cent. If we made a prophecy with such a
correlation our prophecy would be roughly 20 per cent better than if we
had not used the test. However, the test is much more helpful than this.
Suppose a student had consistently worked hard in high school but still
had made only fair grades. Suppose that his I.Q., based on an
intelligence-test score, was only 90. This corroborative evidence might
be the deciding factor, for certainly a person who had done his best in
high school and even then was only able to make fair grades would have
rough going in college. If, on the other hand, a student had frittered
away his time in high school and passed, but was shown by the test to
have an I.Q. of 110, he would be much more likely to succeed in college
did he suddenly acquire a new motive.

Better predictions of subsequent college success can be made by
combining factors than by the use of any one of them singly. In a study
at the University of Wisconsin² correlations are published between
grade-point averages and such intelligence tests as the Ohio State
Psychological Examination and the American Council on Education
Psychological Examination. The coefficients computed with large numbers
of subjects ranged from .41 to .61. By combining intelligence-test
scores and marks for the senior year in high school, a multiple
coefficient of .71 was secured. This raises the predictive efficiency
from 20 per cent, secured from a coefficient of .60, to 30 per cent,
secured from a coefficient of .71.
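
The two percentages quoted above (20 per cent for a coefficient of .60, 30 per cent for .71) are consistent with the index of forecasting efficiency, E = 1 - sqrt(1 - r²). The text does not name the formula, so its use here is an assumption, and the function name is illustrative only.

```python
import math


def forecasting_efficiency(r: float) -> float:
    """Proportional reduction in the standard error of estimate for a correlation r.

    Assumed formula (not stated in the text): E = 1 - sqrt(1 - r**2).
    """
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient must lie between -1 and 1")
    return 1.0 - math.sqrt(1.0 - r * r)


# Coefficients mentioned in the passage: the reported range (.41 to .61),
# the .60 benchmark, and the multiple coefficient of .71.
for r in (0.41, 0.60, 0.61, 0.71):
    gain = 100 * forecasting_efficiency(r)
    print(f"r = {r:.2f}: error of estimate reduced by about {gain:.0f} per cent")
```

Under this assumed formula the gain grows slowly at first: a coefficient of .41 removes less than a tenth of the error, which is why combining predictors to reach .71 matters.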

Intelligence tests have been used to define more accurately the levels
of feeblemindedness. Before the advent of tests, feeblemindedness was
defined in terms of the prudence with which one managed his ordinary
affairs, the skill with which he adjusted himself to his environment, or
his ability to make a living. While these concepts of feeblemindedness
are influential in some quarters, definition in terms of M.A. or I.Q.
derived from a standard intelligence test is gaining ground. Using the
test as the criterion of judgment, we may define the levels of
feeblemindedness as shown in the accompanying table. There is pretty
general

1 One factor which keeps such correlations low is the unreliability of school marks. This unreliability is reflected dramatically in the wide variations of the coefficients in single subjects. When the marks for all subjects are combined into such units as the point-hour ratio, the standing as determined by school marks becomes very reliable.
2 Frohlich, Gustav J., The Prediction of Academic Success at the University of Wisconsin. Madison: Bureau of Guidance Records, University of Wisconsin, 1941.


Level        M.A.        I.Q.
Idiot        0-2         0-20
Imbecile     3-6         20-40
Moron        7 to 8-6    40-65 or 70


agreement about these definitions of idiot and imbecile and about the
beginning of the limits of the moron. Disagreement arises about the
upper level of the moron. Professor Pintner recommended that the upper
limit be placed at 8 years and 6 months and that of the upper I.Q. at
.60. The upper limits of the I.Q. are dependent upon the C.A. which
shall be used in the denominator in cases of maturity. As has been
indicated (pages 362, 363), 14, 15, and 16 have been used as ages of
maturity. This means that the determination of the I.Q. of a boy at 15
and above would use either 14, 15, or 16 as chronological age. Terman …
of 70 as the dividing line between normality and feeblemindedness.
Wechsler uses an I.Q. of 65 for this same purpose. According to him the
limits of a moron would extend from 7 to 10-6, with I.Q. limits of … and
70. In the Terman-Merrill Revision, beginning at year 13, 1 … is
subtracted from each 4 months of C.A. until they reach 15. If one uses
10-6 as the upper limit, many more of the population would be included
in the feebleminded category than when 8-6 is used. Pintner reports: …
Similarly when applying these limits to a random sampling of 4,925
school children not including …

The present author agrees with this recommendation.
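
The dependence of the upper I.Q. limit on the divisor can be made concrete. The sketch below assumes the ordinary ratio I.Q. (100 × M.A. / C.A.) with the C.A. capped at a chosen age of maturity; the function names are illustrative, and the gradual Terman-Merrill adjustment between years 13 and 16 is deliberately not modeled.

```python
def to_months(years: int, months: int = 0) -> int:
    """Convert an age written as years-months (e.g. 8-6) to months."""
    return 12 * years + months


def ratio_iq(mental_age_months: int, chronological_age_months: int,
             maturity_years: int = 15) -> float:
    """Ratio I.Q. with the chronological age capped at the assumed age of maturity."""
    divisor = min(chronological_age_months, to_months(maturity_years))
    if divisor <= 0:
        raise ValueError("chronological age must be positive")
    return 100.0 * mental_age_months / divisor


adult = to_months(20)  # any age above the cap gives the same divisor
for label, ma in (("8-6", to_months(8, 6)), ("10-6", to_months(10, 6))):
    for maturity in (14, 15, 16):
        iq = ratio_iq(ma, adult, maturity_years=maturity)
        print(f"M.A. {label}, maturity {maturity}: I.Q. {iq:.0f}")
```

With a divisor of 15 an M.A. of 10-6 corresponds to an I.Q. of 70, while an M.A. of 8-6 with a divisor of 14 gives about 61, close to the .60 limit mentioned in the text.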


USES OF INTELLIGENCE TESTS FOR VOCATIONAL GUIDANCE
The leading … (2) … of the intelligence possessed by the individual in
question, and (3) the guidance of the individual into the vocation for
which his intelligence … could be secured. Instead of this direct
procedure the data have been gathered indirectly from tests
administered to draftees in the wars. Such medians and percentiles as
we have accumulated are computed from the records of those who said
they were machinists, a procedure subject to error, because an
individual who was merely a machine tender sometimes puts down his
occupation as machinist. In some occupations, such as clerical workers,
engineers, lawyers, and doctors, intellectual requirements have been
clearly defined in terms of intellectual units. In the vast majority of
occupations, these defined requirements in intelligence have not been
determined.

A second difficulty arises from the nature of the intellectual
requirements of vocations. Wherever large numbers of subjects in a
given occupation have been tested, wide variations in intelligence have
been found. A part of this difficulty has arisen because not enough
consideration was given to the degrees of competence reached within the
occupation. Suppose that "machinists" was the classification in
question; then part of the variation in intelligence could be
attributed to the fact that some of the members of this occupation were
apprentices, some journeymen, … It is also evident that weakness in
intelligence in a given occupation can often be overcome … and tact. In
many cases a person of … in the occupation drags along … Striking
illustrations of incompetency … professions as law and medicine.

… cause so much … of test scores that the 25th percentile in that
occupation with … intelligence requirements will … fall at the 75th
percentile of … below it. For example, based on data from the First
World War, the 75th percentile of the electrician was 109 points on
Army Alpha, while the 25th percentile of the physician was 107. This
means, of course, that the upper 25 per cent of the electricians were
on the Army Alpha as good as, or better than, the lower 25 per cent of
physicians. Many electricians … intelligence scores … the average of
the physicians. … overlapping of intelligence … the guidance problems
much …

… in the previous paragraphs that … occupations as listed in the … be
accepted with some reservation. … essential … as little as possible. …
who … as farmers. Farming … essential for carrying on the war; hence
all owners were not drafted. Most of those

TABLE 15. AGCT SCORES FOR CIVILIAN OCCUPATIONS
(Based on scores of white enlisted men only.)

[Bar chart: one horizontal bar per civilian occupation, plotted against an AGCT score axis running from 60 to 145, with percentile markers.]

Source: … Manual, First Civilian Edition, Science Research Associates, Chicago, November, 1948, page 8.


classifying themselves as farmers were either hands on farms or
renters. Therefore the intellectual level of the farmers was affected
by the manner in which the draft worked. If it had worked alike in the
case of all occupations the results would not have been so badly
affected, but there was a differential effect upon various occupations.


а) 3 
‚ Qı (25th percentile) eps 
… were made by the army … have been accumulated and … for all
occupations. Two inferences seem warranted: … does not differ greatly
from … are converted into AGCT standard scores with a mean of 100 and a
standard deviation of 20.

Table 15 shows the scores received by men in various occupations on
this test. The black bars … from the 25th percentile to the 75th
percentile … Observe next (1) the range between the 25th and the 75th
percentiles, and (2) the extent of the scores which ran beyond these
points. Compare the middle 50 per cent of …, about 30 points, with that
of the writer, … overlap between occupations … Some lumberjacks …
higher than 11 per cent of the accountants.
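
The AGCT scale mentioned above (mean 100, standard deviation 20) and the "middle 50 per cent" bars of Table 15 can be sketched in a few lines. This is a minimal illustration, not the army's conversion procedure: the raw scores are hypothetical, the function names are illustrative, and a simple linear standardization is assumed.

```python
from statistics import mean, pstdev, quantiles


def to_standard_scores(raw_scores, new_mean=100.0, new_sd=20.0):
    """Linearly rescale raw scores to a scale with the given mean and standard deviation."""
    m = mean(raw_scores)
    sd = pstdev(raw_scores)
    if sd == 0:
        raise ValueError("scores must vary to be standardized")
    return [new_mean + new_sd * (x - m) / sd for x in raw_scores]


def middle_fifty(scores):
    """Return the 25th and 75th percentiles, the span drawn as a bar in Table 15."""
    q1, _median, q3 = quantiles(scores, n=4, method="inclusive")
    return q1, q3


raw = [12, 15, 18, 21, 24, 27, 30]  # hypothetical raw scores for one group
standard = to_standard_scores(raw)
low, high = middle_fifty(standard)
print([round(s) for s in standard])
print(f"middle 50 per cent: {low:.0f} to {high:.0f}")
```

Two occupations overlap whenever one group's bar reaches into the other's, which is why a score alone cannot assign a person to an occupation.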
From such considerations … suggestions about entering various
occupations obtained from intelligence-test scores must be highly
tentative. Suppose a student received … enter medicine. You might say,
"Your chances of success in medicine are rather slim. … the 25th
percentile of … students. On the other hand your score is at the 55th
percentile for pharmacists and the 60th percentile for salesmen."

1 Memoirs of the National Academy of Sciences, Vol. XV, Part III, Chap. 15.
The usefulness of the tests for guidance varies with the occupation.
Success in some occupations is nicely correlated with scores on
intelligence tests. Executives' success is closely dependent upon their
intelligence. An intelligence test was given to minor executives in
1915, and … 1920, and the results compared with the firm rank. The
correlation was .69. A small group of executives at the head of a … were
ranked by the vice-president as to their executive ability. The
correlation with their rank in an intelligence test was .80.¹ In other
types of occupational activity there is almost no relation between
intelligence-test scores and success.
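
A correlation between two sets of ranks, such as the vice-president's ranking of executives and their rank on a test, is commonly computed by the rank-difference (Spearman) method. The source does not say which coefficient was used, so the method is an assumption here, and the ranks below are hypothetical, chosen only to show the arithmetic.

```python
def spearman_rho(ranks_a, ranks_b):
    """Rank-difference correlation: rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1)).

    Assumes both lists are complete rankings of the same persons with no ties.
    """
    if len(ranks_a) != len(ranks_b):
        raise ValueError("both rankings must cover the same persons")
    n = len(ranks_a)
    if n < 2:
        raise ValueError("at least two persons are needed")
    sum_d_squared = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1.0 - 6.0 * sum_d_squared / (n * (n * n - 1))


# Hypothetical data: six executives ranked on ability and on a test.
ability_rank = [1, 2, 3, 4, 5, 6]
test_rank = [2, 1, 3, 5, 4, 6]
print(f"rho = {spearman_rho(ability_rank, test_rank):.2f}")
```

With so few persons a single swapped pair moves the coefficient noticeably, so a high value from a small group of executives should be read with caution.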
SUMMARY

The main disadvantages of group tests in comparison with individual
tests have been largely overcome. The best standardized group
intelligence tests approach very closely the accuracy of the individual
test. Out of the needs of the First World War for a rough intellectual
classification of a large number of men there developed the Army Alpha.
This test for literates, as well as the Beta test for illiterates and
those … English, were applied in a large number of situations during
this war. The data thus collected furnished living evidence of the value
of group tests of intelligence. From this beginning the construction of
group tests of intelligence went forward by leaps and bounds until today
there are carefully standardized group tests available for every age
from 3 years to maturity.

The detailed study of three series of intelligence tests displayed
three types of test construction. In one of them, the Pintner series,
careful statistical analysis was made at every stage of the test's
construction and development. Just how good a test it is, then, can be
easily determined. The Kuhlmann-Anderson group tests were also
carefully constructed but leaned more heavily on the subjective
judgment of the … who had worked for years in tests, than on more … The
third type, PMA, divides intelligence into five abilities which are
fairly independent and computes the reliability of each.

A study was then made of the test forms which had been found useful in
constructing intelligence tests suitable for the various grade levels
(… grades 1 to 3, grades 4 to 8, and grades 9 to 12) and those found
suitable were introduced. … that certain test forms were useful at all
stages even though with younger children the relations were expressed
in pictures. Analogies, opposites, number completion, logical
selection, classification, vocabulary, and arithmetic reasoning occur
again and again. Increasing difficulty is attained by using more subtle
and more unusual relations and by making the possible answers more
nearly alike.

1 Burtt, Harold E., Principles of Employment Psychology, p. 279. Boston: Houghton Mifflin Company, 1926.
an 


а 1ds of 
individual, have bp pe а 
supplementary апа ei in guid" 
mine Шун qum 
k where they can wor and th€ 
ents like school better, ing indi- 
ts are useful in e im student 
rstanding the failure o z 


tests in this very promising area of guidance. 


QUESTIONS AND EXERCISES

1. What criticisms were leveled … test forms used in constructing
group tests? How were these criticisms met?
2. Describe the salient characteristics of the Army Alpha Test. What
test forms were used in its construction?
3. Illustrate the influence of range of subjects on reliability by
reference to the Pintner-Cunningham test. Explain the new technique,
introduced by Pintner, for computing the I.Q. Compare with the older
method.
4. What criticisms were made of the Kuhlmann-Anderson … of validity did
these … appears in their treatment …
5. How do group tests of intelligence suitable for the kindergarten and
entering first grade differ from those intended for grades 5 or 6?
6. Name and illustrate six … of intelligence for the …, upper grades,
and high school. … the common characteristics of all these tests?
7. How are intelligence tests used in the first grade? With children in
the primary grades?
8. Discuss the uses of intelligence tests in aiding students to select
courses of study. What subjects in elementary school are highly
correlated with intelligence-test scores? What subjects have only low
correlations with … scores?
9. What has been the relation between intelligence-test scores and
success in school? Continuation in school? Number of subjects failed?
How is this problem being met at present?
10. How have group tests of intelligence been used to form homogeneous
groups? Give two reasons why these groups are not as homogeneous as
they … Do you favor such groups? Why?
11. What types of scores furnish the … of college success?
12. How have intelligence tests been used to define feeblemindedness?
What are some difficulties present in trying to … the upper limits of …?
13. Describe the application of … test scores to the … vocational
guidance. What … present themselves when we … decide the amount of
intelligence needed for good performance in any …?


BIBLIOGRAPHY

Books

BUROS, OSCAR K. (ed.): The Third Mental Measurements Yearbook. New Brunswick, N.J.: Rutgers University Press, 1949.
BURTT, HAROLD E.: Principles of Employment Psychology, rev. ed. Boston: Houghton Mifflin Company, 1942.
CRONBACH, LEE J.: Essentials of Psychological Testing, Chap. 8. New York: Harper & Brothers, 1949.
FREEMAN, F. N.: Mental Tests, rev. ed., Chaps. V, VI. Boston: Houghton Mifflin Company, 1939.
FROHLICH, GUSTAV J.: The Prediction of Academic Success at the University of Wisconsin. Madison: Bureau of Guidance Records, University of Wisconsin, 1941.
JORDAN, A. M.: Educational Psychology, 3d ed., Chap. 13. New York: Henry Holt and Company, Inc., 1942.
KOOS, LEONARD V., and GRAYSON N. KEFAUVER: Guidance in Secondary Schools. New York: The Macmillan Company, 1932.
Memoirs of the National Academy of Sciences, Vol. XV, Part III, Chap. 15.
MONROE, W. S. (ed.): Encyclopedia of Educational Research. New York: The Macmillan Company, 1941.
PINTNER, RUDOLF: Intelligence Testing, Chaps. VII, VIII, XII. New York: Henry Holt and Company, Inc., 1931.
PROCTOR, W. M.: Educational and Vocational Guidance. Boston: Houghton Mifflin Company, 1925.

Articles in Journals, Manuals

DAVIS, H.: "Intelligence … Public Schools in …," … Yearbook of the National Society for the Study of Education, Chap. III, pp. …-142. Bloomington, Ill.: Public School Publishing Company, 1922.
DURRELL, DONALD D.: "The Influence of Reading Ability in Group Intelligence Measures," Journal of Educational Psychology (1933) 24:412-416.
FRYER, DOUGLAS: "Occupational Intelligence Standards," School and Society (1920) 16:275.
JORDAN, A. M.: "Student Mortality," School and Society (1925) 22:821-824.
JORDAN, A. M.: "The Validation of Intelligence Tests," Journal of Educational Psychology (1923) 14:348-366, 414-428.
Kuhlmann-Anderson Tests, Instruction Manual. Minneapolis: Educational Test Bureau, 1944.
MADSEN, I. N.: "The Contribution of Intelligence Tests to Educational Guidance in High School," School Review (1922) 30:692-701.
…: … Manual for Ad… the Intermediate …
POWERS, S. R.: "Intelligence as a Factor in the Election of High School Subjects," School Review (1922) 30:452-455.
STEWART, NAOMI: "AGCT Scores of Army Personnel Grouped by Occupation," Occupations (1947) …

PART THREE 


Personality Inventories 




CHAPTER 16 
Measurement of Interest 


It is important to know what activities either in reality or in
imagination … The importance of this knowledge arises out of the fact
that real happiness in life comes from doing well what is enjoyed and
out of the fact that if an activity arouses interest it will be … with
less friction and with more likelihood of success. It is important also
because in exploring various areas of interest one may sometimes
discover new interests which before were not realized.

… the discovery of areas of interests in school children may not only
provide teachers with information so useful in motivating and …
children's curriculums within the class but also may aid the counselor
in assisting his clients to come to some decision concerning the
courses they will take in school and the occupations they will enter.
This is not the place to discuss the relation between interest and
learning, but these two are inextricably intertwined.

Except in a very broad way no one has set down a list of desirable
interests which a student should possess. It is, therefore, impossible
to … degree of success in those interests which are the objectives of
teaching. The problem in the measurement of interests becomes, then,
one of discovery for purposes not of evaluation but of guidance.

CHARACTERISTICS OF INTERESTS

Interest and motive are closely related. The interest which an
individual has in an object frequently arouses the motive of acquiring
it. Motive, as defined by Professor Woodworth, is a "state or set of the
individual which disposes him for certain behavior and for seeking
certain goals." … that a motive is not the situation or the stimulus,
but a set towards a certain goal. Thus, a motive releases energy and
directs it. Hunger is the motive. The food is the incentive which
releases a larger or smaller amount of energy in accordance with its
attractiveness. It is the attractiveness of the goal to the individual
which arouses the motive and which gives one incentive prepotency over


another. John Dewey brings the set and the motive together in his 
description of the latter as a “‘wholehearted identification of oneself 
with a goal or activity." i 

Tied up closely with motive is the matter of interest. Interest is
the pleasant feeling tone which attaches itself either to the activity
or to the goal. If it attaches itself primarily to the activity, it may
be called intrinsic; if primarily to the goal, extrinsic. Along with
this feeling tone there is also in interest an urge to continue the
activity or to seek the goal. Intrinsic interest arises either because
the activity connects up directly with these inherited body needs such
as hunger, sex, thirst, fear, anger, bodily activity; or, because it
falls in line with … patterns already started. Extrinsic interest
arises because of anticipated satisfaction in the goal itself.”


mediocre success b H 
8 ес А or 

ái 4 = А І ater 

in this chapter wes est) is not in his work. La 


ree 
f the relation between achiev 
, but at the moment it j i à 


2 in 

igh : 
5,000 high school students rarely listed the names of salacious bo? 
1 From Jordan, A. M., Educational pg 


york? 
Henry Holt and Company, Inc., 1942. та d ed., pp. 154-155. New 
on. 


MEASUREMENT OF INTEREST 425 


magazines, which unfortunately are frequently read. Spencer! found 
that unsigned questionnaires were answered with more openness and 


frankness than they would have been had they been signed. The second 
elaborate questionnaire about 


difficulty, especially applicable to the 

many occupations, is simply а lack of information about the vocation 
or activity. How could a student really choose which activity he likes 
best and which worse from these questions?? 


g. write novels
h. conduct research on the psychology of music
i. make pottery

He knows nothing of writing novels or of making pottery, and as for
what “research on the psychology of music” is, he has not the slightest
idea.

Even worse possibly than total ignorance is the generalization about
occupations which is frequently made from a few glaring instances. A
subject is presented with a list of occupations about which he is to
express his interest. His imagination has been fired by the extreme
incomes which certain workers in those occupations have made. But the
great lawyer's $50,000 fee for one case, and the baseball player who
receives a salary of $70,000 a year are glittering illustrations of what
the incomes in these usually are not. The student too much
influenced by exceptional cases fails to consider what income the
average and lower bracket of workers receives in these occupations.
Another method which at first glance seems very promising is that of
direct observation. This method has been tried out in a number of
situations. The author, for example, used observation to check the
interests of children in books and magazines against their interests as
expressed through the questionnaire. He found, for example, that
children wore out the covers of some books in libraries while others
were clean and fresh, that certain cards in the card catalogue were
dog-eared and dirty, and that certain books were hidden behind radiators
and the poetry shelves so that they could be secured on the children's
next visit [...] by a large number of children day after day. In another
case, anecdotal records of activities of an individual, such as those
involved in helping for the school [...], have been recorded and used.
In some schools, short introductory [...] embodying some of the essential
features of occupations have been tried out and records made of the
apparent pleasure with which these activities were [...] finished. This
is a promising if comparatively undeveloped method of interest
discovery.

¹ Spencer, Douglas, Fulcra of Conflict, p. 192. Yonkers, N.Y.: World
Book Company, 1939.
² Kuder, G. Frederic, Preference Record. Chicago: Science Research
Associates, 1945.

The third method is based on the assumption that the greater the
amount of information which an individual possesses in any field the
greater will be his interest. The idea here is that if the student likes a
certain area he will read more about it, work at it longer, and remember
it better than he does in those areas where no interest is present. Here
again the opportunity for acquiring information rather than the interest
might have been lacking.


The most successful procedur 
dents and adults has been th. i 


ve 
n the interests of bright and p i 
een various social groups bY (a) 


А as 
JCtween engineers whose work w 
and, (b) social in 


re 
sa f 10 scoring. The number of deg h 
pos varied from yes v Ga “a сан) ihe 
e ve x * їка islike 
much) to a scale AW "i a Mike, Dot decided, dislike, dish. 




a result to have certain common interests which can be tapped. The
procedure of construction consists of selecting a large number of items
which will differentiate between “men in general” and men successful
in a certain occupation. Items which do not distinguish between these
two groups are thrown out. These items are then weighted in scoring in
proportion to the degree of completeness with which they separate these
two groups.

Let us illustrate from Strong's blank. In computing the score on the
item “actor” for personnel managers the following procedure was used:


Per Cent 
I D 
+38 —13 
+35 —27 
DRE n accom un АРЕ E E E 


Final weights for item of “actor” ....

Strong worked out a scheme whereby a difference of 8 to 11 was given a
weight of 2, one of 3 to 7 a weight of 1, and one of 12 to 15 a weight
of 3, [...]
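The banding rule just described can be sketched in a few lines of Python. This is only an illustration and not Strong's own procedure. The function name `item_weight` is hypothetical. Giving a weight of 0 to differences smaller than 3 and rejecting differences larger than 15 are assumptions not stated in the passage. The sign convention, positive when the occupational group exceeds "men in general," is likewise assumed. The percentages in the usage note are invented.

```python
def item_weight(pct_occupation, pct_men_in_general):
    """Turn a percentage difference on one response into a scoring weight.

    Bands quoted in the text: 3-7 -> 1, 8-11 -> 2, 12-15 -> 3.
    Assumptions (not in the text): a difference under 3 gets weight 0,
    and a difference beyond 15 is outside the quoted bands, so it raises.
    """
    diff = pct_occupation - pct_men_in_general
    size = abs(diff)
    if size < 3:
        weight = 0
    elif size <= 7:
        weight = 1
    elif size <= 11:
        weight = 2
    elif size <= 15:
        weight = 3
    else:
        raise ValueError("difference lies outside the bands quoted in the text")
    # Keep the direction of the difference in the sign of the weight.
    return weight if diff >= 0 else -weight
```

With invented figures, `item_weight(30, 21)` falls in the 8-to-11 band and `item_weight(20, 33)` falls in the 12-to-15 band with a negative sign.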

Form M, as at present constituting the Strong Vocational Interest
Blank for Men, is composed of 400 items divided into the following
parts:
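                                                      Number of Items
   I. Occupations ........................................ 100
  II. School subjects .....................................  36
 III. Amusements ..........................................  49
  IV. Activities ..........................................  48
   V. Peculiarities of people .............................  47
  VI. Order of preference of activities ...................  40
 VII. Comparison of interest between two items ............  40
VIII. Rating of present abilities and characteristics .....  40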
а Now possible to score these 400 kr differently for ar 
; Parate ; ch occupat tion. The num- 
pë key RN: pedir differently Or E. iai m baec 
it Pipes үе Т j these samples vr ICA P retais to 513 
13 repr ons included ys S apation of YM hne Lice 
айе, esentatives of t AS CC ations the num er E mir n 
280 ür s. In 13 out of the 35 0 p from architect o ж hem 
State ое, These occupation o mathematician to p E o 
‘o Procedures i ane d dae ing orale o also been devel- 
9 that "pow also provide ta bs rod Scales € (0) de ic 
Peq озщ would not ae (1) maturity of interests, 
curing measures 




level, (3) self-confidence and sociability, (4) social adjustment, (5)
scholastic aptitude, and (6) theoretic and economic evaluative attitudes.
The reliability, validity, norms, and manual have all been carefully 


lower. One study (Carter, Canning, 
a high school boy receives a ‘C’ rat 
per cent chance that he w 


€ interests of “теп in geneti 
tain that they really represe? 


ers 
ple, only the successful memb 




tha: 7 

ie pier гта the clinical use of the blank has borne out this con- 

Noni fs one of the validity of this inventory. 

Norms for the inventory have been worked out in letter scores, standard
scores, [...]. Sample tables of distribution are furnished [...]. The
most practical one of these measures is the letter score. If a subject's
interests agree pretty largely with those of a certain occupation then
he receives an A in that occupation. Technically, if a subject's
interest score is not lower than 0.5 sigma below the average of an
occupation then he receives an A. This amounts to 69 per cent of the
scores in that occupation. If he falls in the next 29 per cent of that
occupation's scores, he receives a B or B-. In the lowest 2 per cent he
receives a C. An individual who scores a C in any occupation has no real
interest in that occupation, or no more than “men in general.” The
method of scoring, reliability, validity, and norms are all well
explained in the manual.
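The letter-rating rule can be checked with a short sketch. It assumes, as the passage does not say outright, that scores within an occupation are normally distributed; under that assumption a cutoff 0.5 sigma below the mean leaves about 69 per cent of the occupation above it. The function name `letter_rating` is hypothetical, and the single letter "B" stands here for the whole band of B ratings.

```python
from statistics import NormalDist

def letter_rating(score, occupation_mean, occupation_sd):
    """Assign A, B, or C from a score's position within the occupation.

    Assumes normally distributed scores (an assumption, not from the text).
    A: not lower than 0.5 sigma below the occupation's mean.
    C: within the lowest 2 per cent of the occupation.
    B: everything between (the text's B or B-).
    """
    z = (score - occupation_mean) / occupation_sd
    if z >= -0.5:
        return "A"
    if z < NormalDist().inv_cdf(0.02):
        return "C"
    return "B"

# Share of a normal distribution lying at or above -0.5 sigma.
share_a = 1 - NormalDist().cdf(-0.5)
```

Under the normal assumption `share_a` rounds to 0.69, which agrees with the 69 per cent quoted for the A rating.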
Strong has also issued a Vocational Interest Blank for Women built
in the same way as the blank for men. It contains 400 items, 263 of
which are the same as those contained in the blank for men. There were
in 1951 [...] occupational scales ready for use varying from “artist” and
[...] to “teaching physical education in high school” and “YWCA [...]
secretary.” Reliabilities, validities, and norms are developed
along lines similar to those for the men.
One of the great difficulties with this inventory is the time it takes to
score it. If scored by hand, even by an expert, it takes 5 to 10 hours to
score the 39 different occupations. It is also expensive to have the
scoring done at the central office. Some experimenters have tried to
shorten the scoring by weighting the answers 1, 0, -1 instead of using
the present scheme in which the score for an L-score ranges from 4
through zero to -4 and for a D-score from +4 to -4. Strong holds that
this procedure makes the results a little less reliable and hence will
have none of it.¹ In the second place, a liking or dislike may be
superficially acquired. It may be based on [...] and hence the
generalization may be specious, or it may be due to a lack of
information about the activities in question, or there may even be an
attempt on the part of the subject to [...] about his real interest. For
these reasons no one should fill out the blank who is not seriously
concerned about arriving at a knowledge of what his real interests are.
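The difference between the weighted key and the simplified 1, 0, -1 key mentioned above can be made concrete with a small sketch. The item names, weights, and responses below are invented for illustration and are not taken from Strong's keys; `total_score` and `to_unit_weights` are hypothetical helper names.

```python
def total_score(responses, weights):
    """Sum the weight attached to each response the subject chose."""
    return sum(weights[item][answer] for item, answer in responses.items())

def to_unit_weights(weights):
    """Collapse every weight to its sign: +1, 0, or -1."""
    def sign(w):
        return (w > 0) - (w < 0)
    return {
        item: {answer: sign(w) for answer, w in by_answer.items()}
        for item, by_answer in weights.items()
    }

# Hypothetical key for three items; answers are L (like),
# I (indifferent), and D (dislike), with weights from -4 to +4.
weights = {
    "item_1": {"L": 4, "I": 0, "D": -3},
    "item_2": {"L": -2, "I": 1, "D": 2},
    "item_3": {"L": 1, "I": 0, "D": -1},
}
responses = {"item_1": "L", "item_2": "D", "item_3": "I"}

weighted = total_score(responses, weights)                 # 4 + 2 + 0
unit = total_score(responses, to_unit_weights(weights))    # 1 + 1 + 0
```

The unit key treats a strongly differentiating response and a weakly differentiating one alike, which is the loss of information behind the objection reported above.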
¹ Strong, Edward K., Jr., “Weighted vs. Unit Scales,” Journal of
Educational Psychology (1945) 36:193-216. Strong says (p. 215), “On
such a basis unit scale scores will lead to different counseling from
weighted scores in from one-sixth to one-twelfth of the cases.”




The Cleeton Vocational Interest Inventory approaches the problem
of interest in a manner similar to that of Strong's Vocational Interest
Blank. It lists occupations, school subjects, characteristics of people,
activities, and magazines and asks the subject to express his likes or
affirmations by placing a + after the item and his dislike or negation by
placing a 0 in the same position. There are 670 items in all, grouped
around nine occupational families and an introvert-extrovert dimension.
The areas or families of occupations are (1) physician, (2) life-insurance
[...], (4) teacher, minister, or social worker, (5) purchasing agent,
(6) lawyer, (7) mechanical occupations, (8) accountant,


n 
еті, 


ri- 
"Store salesclerk, (3) nurse ог bact? 


5 дыл 
first 2 n à second administration Wit е5 
St marking of the ; 


were changed fr Nventory, 6.1% of the respo” igs 
e 


in | 
properly that if it "a 19 D. Or from ‘4? to “0?” The author ^ 
selected in at the ite inventory : 
occupations oe "i ШШЕ that they have IE for specifo 
whose basic occ eir validity is assured. He thus selected some wen 
well as some ee Significance had already been determine ác 
liess? Tin ease, oí лы ше of their agreement with those se 
Diels |... Persons “succesef in stan 
se! alg were analyzed in order to i is сн к” * 
e nine scales of the inventory, The Mrs ine their agree n 
1 Manual, pp. 20-21, 5 showed a high ав 


ho 




between the occupation being followed and the corresponding scale
score:

Among these 7424 persons, the highest inventory rating of each [...]
with the occupation being followed in 76% of the cases. [...]ty-two
per cent rate either first or second on the inventory scale
corresponding to their occupation, and 95% rate first, second, or
third in the corresponding scale.¹

Norms for grades 9, 10, 11, and 12, college freshmen, and adults are
furnished.
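Agreement figures of the kind quoted from the manual (first, first or second, first to third) amount to counting how often a person's own occupation falls among his k highest scales. The sketch below shows one way such a tally might be made; the function name `top_k_agreement` and the records are hypothetical and invented for illustration, and nothing here reproduces the author's actual data or method.

```python
def top_k_agreement(records, k):
    """Fraction of persons whose own occupation is among their k highest scales.

    records: list of (occupation, scores) pairs, where scores maps each
    scale name to that person's inventory score.
    """
    hits = 0
    for occupation, scores in records:
        # Scale names ordered from highest score to lowest.
        ranked = sorted(scores, key=scores.get, reverse=True)
        if occupation in ranked[:k]:
            hits += 1
    return hits / len(records)

# Invented records for four persons and three scales.
records = [
    ("lawyer", {"lawyer": 30, "physician": 22, "accountant": 10}),
    ("physician", {"lawyer": 25, "physician": 24, "accountant": 12}),
    ("accountant", {"lawyer": 28, "physician": 20, "accountant": 15}),
    ("lawyer", {"lawyer": 18, "physician": 27, "accountant": 9}),
]
```

Calling `top_k_agreement(records, 1)`, then with `k=2` and `k=3`, gives a rising series of proportions in the same way that the manual's three percentages rise.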
The inventory has several strong points. It gives the subject an
opportunity to express his likes or dislikes about a very large number
(670) and a large variety of items. The inventory is easy to score. One
may simply count the number of plusses or it may be machine-scored.
The manual of directions is excellent, describing as it does the
development and construction of the scale as well as the conditions
under which the scale may be used most successfully. Many counselors
would agree that the determining of areas of interest is about as far as
it is practical to go with an inventory of occupations. But not all
features of the inventory are desirable. Many occupations are so much
like one another that the subject may carry over his interest in one
occupation to the others listed. Some critics have voiced their
objections to the failure of the author to describe his principle of
classification whereby only nine areas are arrived at. It is indeed
curious to place manicurist with [...] category or to place watchmaker
under [...] sciences. There are no correlations computed between the
areas so that one cannot tell whether or not there is overlapping
between them. Such a weakness should be rectified. In conclusion, we can
say that this inventory is practical and useful and that it furnishes
roughly the subject's area of occupational interests.

The Kuder Preference Record is suitable from grades 9 to 16, i.e., for
both high school and college students. Interest is expressed by
indicating a preference among three activities. The statements of these
activities are arranged in groups of threes. The instructions are:

Read over the three activities of each group. Decide which of the three
activities you like most. Note the letter in front of it and punch a hole
through the 1 beside this letter in the column at the right, using the
pin with which you are provided. Then decide which activity you like
least and punch a hole through the 3 beside the corresponding letter in
the column at the right.²

¹ Manual, pp. 21-22.
² Quotation and items by permission of Science Research Associates,
Chicago.


The two following triplets will serve as examples:

g. Study physics                            (1) g (3)
h. Study musical composition                (1) h (3)
i. Study public speaking                    (1) i (3)

r. Make a study of flower arrangement       (1) r (3)
s. Make a study of mental ills              (1) s (3)
t. Make a study of propaganda methods       (1) t (3)


[...] showed four outstanding types of interests: (1) science,
(2) language, (3) people, and (4) business. We might [...] place the
scientific area beside science, the literary area beside language, the
social-service and persuasive areas beside interest in people, and the
computational area beside interest in business. It is important to note
that these two instruments for measuring interest, developed in such
different ways, should have come to as much agreement as is here
indicated.


This preference record is reliable. The reliability of each of the nine
divisions has been studied with graduate students, college students,
high school seniors, and even with grade 8, and with both men and
women, boys and girls. The reliabilities are .90 and above [...]. In the
1944 manual norms are furnished for sophomore, junior, and senior high
school students, both boys and girls. These are separate norms for the
two sexes because of substantial sex differences in the mechanical,
computational, scientific, musical, artistic, social-service, and
clerical divisions. Boys are clearly more interested in the first three
groups and girls in the last four. In the literary and persuasive areas
the differences are small. It is quite clear that data are available for
comparing an individual's preference record with others of [...]. The
norms are continually being improved by the addition of new cases. Each
new manual includes improved bases [...].

A certain resemblance exists between comparable areas of the Kuder
Preference Record and the Strong Vocational Interest Blank. For
example, Strong's artist score correlates .56 with the artistic area of
Kuder (N = 166); Strong's engineer score correlates .72 with the
mechanical area and .54 with the scientific area of Kuder; Strong's
[...] scores correlate .51 with the mechanical area and .73 with the
scientific area of Kuder. With Kuder's computational area Strong's
scores for the accountant (C.P.A.), purchasing agent, and banker
correlate between .38 and .49, while these same occupational areas of
Strong correlate between .36 and .62 with Kuder's clerical interest.¹
While these correlations are substantial it is not possible to
interchange their scores. Their categories are different and must be so
considered.

The newest of these interest inventories suitable for high school
students is the Occupational Interest Inventory by Edwin A. Lee and
Louis P. Thorpe. The test consists of 120 paired items and 30 items of
triads. Two items follow from the 120 pairs in which the directions are:
“Put a circle around the letter preceding the activity you choose.”


19  B  Clip hedges and trim trees
    C  Mix cement, or carry plaster or bricks

36  D  Check the accuracy of financial statements or records
    F  Use scientific laws to develop new machinery

The instructions for the triads are “You are to choose one of the three
[...] around the letter preceding [...].”

10  Keep the accounts and collect the money for a paper route
    Manage the financial accounts and collections in a large company
    Figure payrolls, salary rates, and salesmen's commissions²

¹ The coefficients in this paragraph are from Triggs, Frances Oralind,
“A Further Comparison of Interest Measurement by the Kuder Preference
Record and the Strong Vocational Interest Blank,” Journal of
Educational Research (1943-1944) 37:538-544.
² Items by permission of California Test Bureau, Los Angeles, Calif.


By scoring the first set of 120 items, interest scores of six occupational
families may be obtained:

1. Personal-social (domestic, personal, social services, teaching, law)
2. Natural (farming, gardening, fishing, lumbering, caring for animals)
3. Mechanical
4. Business
5. The arts
6. The sciences


interest at a high level—su 
ble to score for levels of 
us to obtain three addit 
lative, and (3) computa 
-93 for each field of in 


aah 
| 's mechanica], 12; business with Kuder’s up h 
.74; the sciences with Kuder's scientific, .80; and computational with
Kuder's computational, .50.¹ The Lee-Thorpe inventory has little or no
correlation with intelligence-test scores. The validation is incomplete
because it has not been applied to persons engaged in a large variety of
occupations. In short, the Lee-Thorpe inventory shows promise of being
a very useful instrument for purposes of interviewing and with further
study may develop into a very valuable interest inventory.
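Coefficients such as those reported between one inventory and another in this chapter are correlation coefficients. The text does not say which formula was used; the sketch below assumes the ordinary product-moment coefficient. The function name `pearson_r` and the two lists of scores are invented for illustration and are not data from any of the studies cited.

```python
from math import sqrt

def pearson_r(x, y):
    """Product-moment correlation between two equally long lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products of deviations from the two means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Invented scores for five students on two comparable scales.
scale_one = [12, 18, 25, 31, 40]
scale_two = [20, 24, 33, 30, 45]
r = pearson_r(scale_one, scale_two)
```

A coefficient near 1.00 means the two scales rank the students almost identically; values such as .50 leave a good deal of disagreement, which is one reason scores on comparable scales are not treated as interchangeable.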
In considering which one of these four inventories to use, the
vocabulary load deserves some weight. One investigator measured the
vocabulary load of seven inventories.² The results for our four
inventories are shown in the table below. The vocabularies of
Lee-Thorpe and Kuder are better adapted to the high school grades than
those of Cleeton and Strong.

              Percentage of Different Words
Inventory     above the Level of Grade 9
Lee-Thorpe    8.9-9.6
Kuder         10.6
Cleeton       14.9
Strong        16.1

Table 16 presents a selected list of interest inventories. One of them,
Dunlap's Academic Preference Blank, is suitable for younger children
and is constructed for the purpose of discovering the interests of
children in school subjects or areas. The Garretson and Symonds
Interest Questionnaire for High School Students covers only three areas
of interest: the academic, the technical, and the commercial. The
others in the table have less value for our purposes.

¹ These correlations are from Lindgren, Henry C., “A Study of Certain
Aspects of the Lee-Thorpe Occupational Interest Inventory,” Journal of
Educational Psychology (1947) 38:353-362.
² Roeber, Edward C., “A Comparison of Seven Inventories with Respect to
Word Usages,” Journal of Educational Research (1948-49) 42:8-17.
DIRECT OBSERVATION OF THE INTERESTS OF CHILDREN AND STUDENTS

Direct observation is an ancillary and corroborative technique rather
than a primary one. The motives of children, students, and adults are
so complex that their actions are easily misinterpreted. And yet there
are some possibilities here. The author¹ checked the records of
children's interests obtained through a questionnaire by observing the
children at their reading in public libraries. This was done (1)
directly by observing the books which the children freely selected, and
(2) indirectly by rating the blackness of the cards in the card
catalogue as well as by recording the number of books worn out. In like
manner records can be kept of the types of plays and games which
individuals like at different seasons. Anecdotes also, if recorded at
the time of the occurrence and accumulated from time to time, are
valuable aids for discovering the range of children's interests.² For
example, the author once observed a group of boys plan and construct a
radio tower. They worked for the money to buy the materials and
constructed the tower themselves. The record of such experiences forms
a capital illustration of the anecdotal record.


INTERESTS THROUGH INFORMATION

A third procedure used in discovering interests is through measuring
the information about a topic possessed by an individual and inferring
from his information the amount of interest he has in it. On the one
hand, you ask a subject to express his feeling toward an item in the
terms of like, indifferent, dislike (L-I-D); on the other, you check his
actual information. On the one hand, you ask him if he likes baseball;
on the other, you question him to see if he knows what a “squeeze play”
is or a “fielder's choice.” In the latter case you infer that if he
possesses a real interest in baseball he would have a very [...] would
have liked baseball, would have read the rule book, would have had a
pleasant glow when he discovered a new idea, and would have talked
about it with others and in consequence remembered it well. On the
other hand, as a result of association with friends there might be an
accumulation of facts on a topic in which a person has little interest.

¹ Jordan, A. M., Children's Interests in Reading. Chapel Hill: The
University of North Carolina Press, 1926.
² Jarvie, L. L., and Mark Ellingson, A Handbook on the Anecdotal
Behavior Journal. Chicago: University of Chicago Press, 1940.

TABLE 16. INTEREST INVENTORIES

Name | Grade | Types of scores | Reliability | Validity | Norms | Publisher | Time, minutes
Kuder Preference Record | 9-16 | (See text) | | | | Science Research Associates | 40-60
Strong's Vocational Interest Blank | 10-16 | | .877 | (See text) | 39 occupations | Stanford University Press | 40
Lee-Thorpe Occupational Interest Inventory | 9-11 and 12 up | (See text) | | | Norms for each type of score | California Test Bureau | 30-40
Glaser-Maller Interest Values Inventory | 9-16 | Theoretic; Aesthetic; Social; Economic | .91; .93; .92; .87 | Items kept which distinguished between the four types already known to be present in students | Norms for four types | Teachers College, Columbia University | 30
Cleeton Vocational Interest Inventory | 9-16 | (See text) | | | Norms for the nine types | McKnight & McKnight | 45-55
Brainard and Stewart Specific Interest Inventory | * | Subject's interest in a particular mode of expressing activity such as physical work, vocal expression, experimenting | .68; .13-.94 | No objective data on reliability or validity of inventory. Question as to independence of separate divisions: outdoor, scientific, experimentation, observation, and creative imagination | Norms for different modes of expressing activity | Psychological Corporation | 30-40
Garretson and Symonds Interest Questionnaire for High School Students | 8-10 | Academic; Commercial; Technical | .86; .925; .953 | Biserial correlation between selecting one curriculum rather than another. Predicted what curriculum a boy would choose | Norms for each of three types of curriculum | Teachers College, Columbia University | 30
Dunlap Academic Preference Blank | | Interest in eight areas of elementary school | .70-.83 | 90 words or phrases relate to special academic areas. Correlated with success in each area and with general intelligence | (1) Paragraph meaning, (2) word meaning, (3) history, (4) language usage, (5) geography, (6) literature, (7) arithmetic, (8) general achievement | World Book Company | 15

* Age: boys 10-16, men 16 and over, girls 10-16, women 16 and over.


The Information Test of Interests 


In the constructio: 
interest of subjects, sam 


е Verage, these coefficients have ZR est 
s in the area of validity that they show their £ 
Weakness, 
Validity

It is customary in establishing the validity of a test to compare the
records of one test with records obtained from another test or from
ratings of competent persons. These objective interest tests do
correlate with each other. The coefficients range from .57 to .70, with
a [...].² This is not much below the intercorrelations of [...]. It is
when these information tests are correlated with estimates of interests
that their true validity is determined. In one


ds 

n scores miga between estimated occupations Ка | 
А теѕ wit! Н со . nl .15. jc 

gible relation § correlation of only n 


А paniy 
: : Experi i Мес?“ ye 
ndicated that potab vith the Army su 


indeed a negli 
Interest Test i 


¹ Fryer, Douglas, The Measurement of Interests. New York: Henry Holt
and Company, Inc., 1931. Chapter VIII [...] treatment of the whole
subject [...] more extended than in the present [...]. In Fryer's text
have been gathered the correlations [...] these objective interest
tests, on reliability and validity.
² McHale, Kathryn, “An Information Test of Interests,” Psychological
Clinic (1930) 19:53-58.

and estimated interests was a little higher than .15, perhaps [...].
This relationship is a low one at best and indicates that [...]
emphasizing different aspects of the [...].

[...] of subjective interests the question arises as to the relation
between scores on tests of objective interests and on [...]. The
average coefficient of correlation between [...] and scores on such
measuring instruments as Stenquist Assembly Test or Stenquist Picture
Tests, which are [...] mechanical ability, was above .40. It is clear
that interest and success [...] sufficiently unlike to demand different
types [...]. [...] interest being measured. If the interests are social,
such as in Ream's Social Relations Test, then the correlation is marked.
If, on the other hand, the interest measure is an indicator of
mechanical interests the coefficient is much lower. In the former case
the coefficients were around .60; in the latter, around .40. Tests of
general information have, since the advent of Army Alpha, stood up well
as measures of intelligence. Their correlations with other indicators of
intelligence have been as high as most other forms used in the
measurement of intelligence. It is thus indicated that a score on an
objective test of interest is measuring intelligence in part. This is an
anticipated finding because the acquisition of information has long
been recognized as a function of the joint influence of interest and
intelligence.

From these considerations it seems evident that the scores on a test
of objective interest are composed of much more than the mere interest
that an individual has in that area. Intelligence and past experience,
whether with interest or not, are additional factors. At any rate the
scores are not clear-cut measures of interest. Because of this ambiguity
in the meaning of the scores, objective tests of interests have never
gained the popularity that subjective measures have. If some technique
could be found which would separate the interest factor from the others
these objective tests of interest would forge rapidly to the front.
Objective tests of interest have tended to be merged with tests of
aptitude and intelligence, leaving the interest factors to be measured
by the subjective tests.
USES OF INTEREST INVENTORIES

Results from a well-administered and unprejudiced taking of interest
inventories are of use in four areas: (1) they help an individual assess
his own interests, (2) they are of use to the counselor and the pupil in
educational guidance, (3) they help the student in his choice of an
occupation, and (4) they aid the teacher in motivating and expanding
the work of the classroom.

In the first place, the study and discovery of a student's interests are
valuable in his personal, educational, and occupational or vocational
development. The taking and study of such an interest inventory as has
been described in this chapter establishes in the individual the habit
of studying his own personality traits objectively. He finds, for example,
that his real interests are different from those advised by his parents


gs him face to face with his own wee 
а sobering thought. If the tence e 
1s strong and weak interests he may nis 
К out his most intimate interests: “ 


says in his manua]: that this es Sum total of happiness. AS ^, jpt? 


an occupation where he Sedi] Ocedure would help а student 8° and 


the greatest personal т к d “fewest personal handicaP 


In the fourth place, many uses can be made of children's interests
within the classroom. Interests in radio programs, reading, moving
pictures, and the events of the day can be used along with inventoried
interests in children's learning. Projects based on such interests and
involving activities growing out of them may develop meanings and
expand horizons which otherwise might have remained little understood
and narrow. To a child interested in adventure such books as Treasure
Island or Call of the Wild are a godsend. It is well for teachers to
know the interests of their students, for in answering their questions
and directing their activities a type of education may be developed
which will continue long after the course is completed.

¹ Cleeton, Glen U., Manual of Directions, Cleeton Vocational Interest
Inventory, p. 8. Bloomington, Ill.: McKnight and McKnight, 1943.
RELATION OF INVENTORIED INTERESTS TO OTHER TRAITS

The amount of relationship of interests to (1) measures of achievement,
(2) measures of general intelligence, and (3) measures of special
aptitudes is of considerable importance.


The coefficients of correlation computed between school marks and
interests or between achievement tests and interest are not high. In
general this relation is represented by a coefficient of correlation
between [...] and .40. Garretson and Symonds¹ report no resemblance
between commercial interests and commercial grades (r = .00), but a
slightly higher coefficient (r = .29) between technical interests and
grades in technical subjects. Correlations between the Kuder Preference
Record and achievement as measured by standard tests have been reported
somewhat higher. One study² showed that the coefficients between
interest in science and general [...] achievement were .42 for women and
.32 for men, while the corresponding coefficients between interest in
literature and achievement in literature were .33 for women and .40 for
men. Most of the coefficients computed from Strong's Vocational
Interest Blank and achievement in school have been done at the [...]
level and, in general, are [...] than those here reported.

¹ Symonds, P. M., and O. K. Garretson, Interest Questionnaire for High
School Students. New York: Bureau of Publications, Teachers College,
Columbia University, 1939.
² Triggs, F. O., “A Study of the Relation of Kuder Preference Record
Scores to Various Other Measures,” Educational and Psychological
Measurement (1943) 3:341-354.
³ Kornhauser, A. W., “Results from a Quantitative Questionnaire on
Likes and Dislikes Used with a Group of College Freshmen,” Journal of
Applied Psychology (1929) 11:85-94.

The relation between measured intelligence and inventoried interests
resembles rather closely that between interest and achievement. In one
study (Kornhauser, 1929)³ the reported correlation between the
Kornhauser General Interest Inventory and intelligence was .29. When

Primary Mental Abilities¹ were correlated with the different interest
scores (the Kuder Preference Record) the correlations when 512
university freshmen were used as subjects were low with but one
exception. Computational interest had a present but low correlation
(.39) with number ability.²
Special aptitudes and scores from interest inventories are inclined E 
be only loosely related. In one extensive study, which included subjec 
from grade 7 to freshmen in college, scores from an interest invento? 
(Interest Analysis Blank for Boys) correlated from .00 to .35 with те 
ures of mechanical abilities. With the Minnesota Spatial Relations is 
the interest scores correlated from .09 to .30; and with the Minnes? 
Assembly Test and the Minnesota Paper Form Board the coefficie 
were no different. Finally, when the Mechanical Abilities Batter b 
ree к шн: with the Interest Analysis Blank for Boys bk 
00 to. 5. It is clear that one cannot depend on mechanical interests
to predict tested mechanical abilities.
d here, but much more from the * 
learly inferred that interests ате 80а? 
m interest scores neither achieveme? t Þe 
titude. Measures of these last MYS est? 
sts. Since these great areas of inte’. 


nd special aptitudes are separate, some Ме 


otal 
ate 


{5 

H + А es 
tion is ind h inf 
te 


1 Adkins, D. C., and G. F. Kuder, “The Relation between Primary Mental
Abilities and Activity Preferences,” Psychometrika (1940) 5:251-262.

2 Super, D. E., Appraising Vocational Fitness, Chaps. XVI, XVII, XVIII.
New York: Harper & Brothers, 1949. These chapters give a much more complete
treatment of these matters.

3 Hubbard, R. M., “A Measurement of Mechanical Interests,” Journal of Genetic
Psychology (1928) 35:229-252.

[Figure: Profile chart from the Self-appraisal Program of Guidance in the Junior High School. (From Self-appraisal Program of Guidance in the Junior High School, Superintendent, School District of Philadelphia, 1947.)]


i d. 
i t results is focuse БЫТ. 
А hom the purer light of test pod 
о рн testing process is nothing but a tinkling 
3 + 
ш == 35 may be interpreted as follows: 
The first chart describes a boy with highest aptitude scores in
number and in spatial thinking. His chief interests are in the mechanical,
computational, and scientific fields. He seems to have little


Ш] as 
t this boy would be happy as аф 
е speaking and writing are not pa clearly 
Pp diu Asper ck 
nic, a technician in in ineer. 
"come a mechanical eae we 
т at present. One is а CU 


a e me- 
school that will eer 


i 
З ; choo 
the Standard Evening High yar 
units were needed for entrance into the eng 
college of his choice. 


SUMMARY 
Three techni 


iret 
1) dir 
tried out to discover interests: Сә e 
questioning, (2) observation and (3) Objective tests of infor 

these, the direct 


s aire t 
such shortcomings qen pe е а 
Proved to be both La iP" oud valid, Their scores: vary in 
from an Interest score In а well-k. 


in 
Дап 
, Pears in Sejf- appraisal Program of Gut 
Junior High School. Schoo) D elphia, 1947. 




[pe aet only as evidence Sup 
Жош er techniques. The objective 
appear to be the most promisin, 


des 1 

С: 

ribed. Howev ; 1 infor 
and achiev 


table. Promising beginni 
ave been п 
interest hav 
the classro 
ina 
Та the area 
seful in getting a willing subject to 


1 : 
"s ae both intelligence 
M aee are thus far insurmoun 
E. sim testing of interests h 
S 4% of these measures of 
linn: een found useful in aiding 
sts already present as well as 


Stud. 
AD select a program of studies. 
lese interest scores have proved u 


view Seas ^ 
objectively the types of inter! 
ory encourages à su 


interests. Finally, 
he field of occupations W 
le, that his intere 
ts the vocations 


ee scoring such an invent 
улеш attitude toward his own 
might Е aids in narrowing t 
ical, а ge He finds, for examp 

act which definitely limi 


QUESTI 


1. р 
Dose Te is it said that the main pur 
Or gui the measurement of interests is 

Мише 
telae are motives and interests 
? How different? 
Cipal ee and evaluate three РГ 
nt ethods used in the discovery 9 

x m 
“Ж һаз not the amount of in- 
mont ^ in any area reflected the 
1 Whe interest present? 
errors t at are three principal sour 
: terest q be considered in using he 

questionnaire? 
Valgo E the process 
ng interest questionnai 


in- 


ces of 


used in 
res. 


tures of 




plementary and confirmatory to 
tests of interests at first glance 
g of all the techniques thus far 


mation in any form is closely corre- 


ement, difficulties have arisen 
ngs in this area 
ever quite fulfilled. 

e been widespread. They 
om teacher to direct the 
ssisting the counselor to help the 
of vocational counseling 


he actually possesses. 
bject to assume an 
the taking of such 
hich the student 
sts are clearly mechan- 
to be considered. 


ests which 


ONS AND EXERCISES 


and (d) the Lee-Thorpe Occupational 
Interest Inventory. 

8. Explain precisely how these in- 
ventories can be used by the teacher 
and by the counselor. 

9. a. Make a table which includes the
divisions of interest obtained from 
scoring (1) Cleeton’s Vocational Interest 
Inventory, (2) Kuder Preference Rec- 
ord, and (3) Lee-Thorpe Occupational 
Interest Inventory. 

b. Which seems to you the most 
useful arrangement? Why? 

10. What are the conclusions concern- 
ing the permanence of interest in (a) the 
elementary school, (b) the high school,


and (c) the college? 
s to which interest 


а 1 What ar i 
@ Соп deus = Blank, it Discuss es use: н 
огу on's Vocational Interest Inven- inventories may be put- 
› (c) the Kuder Preference Record, 
BIBLIOGRAPHY

Books

Fryer, Douglas: The Measurement of Interests. New York: Henry Holt and
Company, Inc., 1931.

Greene, Edward B.: Measurements of Human Behavior, Chap. XV. New York:
The Odyssey Press, Inc., 1941.

Jordan, A. M.: Children's Interests in Reading. Chapel Hill: University
of North Carolina Press, 1926.

Remmers, H. H., and N. L. Gage: Educational Measurement and Evaluation,
pp. 407-425. New York: Harper & Brothers, 1943.

Smith, Eugene R., Ralph W. Tyler, et al.: Appraising and Recording Student
Progress, pp. 358-402. New York: Harper & Brothers, 1942.

Strong, E. K., Jr.: Vocational Interests of Men and Women. Stanford
University, Calif.: Stanford University Press, 1943.

Super, Donald E.: Appraising Vocational Fitness, Chaps. XVI, XVII,
XVIII. New York: Harper & Brothers, 1949.


Articles in Journals, Manuals

Adkins, D. C., and G. F. Kuder: “The Relation between Primary Mental
Abilities and Activity Preferences,” Psychometrika (1940) 5:251-262.

CANNING, L. B h TAD 
Taytor, and H. D 


С. Е. Kuper: 


of Educa- 
41) 32:487-493. 

CARTER, Н. D, K. V, р. TAYLOR, and 

“Vocational Choices 

»cores of High School 

of Psychology (1941) 


FRANSDEN, ARDEN: 


E “ Appraisal of 
Interest in Guidance,” ond 0 
ucational Research 1945- 
rr (1945, 1946) 


Новвлвр, В, M.: « 
Mechanical Interests,” 


іс Psychology (1928) 35 
Janvig, L 


Measurement of 
ournal of Genet- 


Press, 1940, 
Kornhauser, A. W.: “Results from a Quantitative Questionnaire on Likes and
Dislikes with a Group of College Freshmen,” Journal of Applied Psychology
(1929) 11:85-94.

Kuder, G. F.: Manual to the Kuder Preference Record. Chicago: Science
Research Associates, 1939, 1946.

Lindgren, Henry C.: “A Study of Certain Aspects of the Lee-Thorpe
Occupational Interest Inventory,” Journal of Educational Psychology (1947)
38:353-362.

LE, KATHRYN: |
Pep of Interests,” Psychological Clinic (1930) 19:53-58.

Roeber, Edward C.: “A Comparison of Seven Interest Inventories with
Respect to Word Usage,” Journal of Educational Research (1948-1949)
42:8-17.

Self-appraisal Program of Guidance in the Junior High School. School District
of Philadelphia, 1947.

Strong, Edward K., Jr.: “Weighted vs. Unit Scores,” Journal of Educational
Psychology (1945) 36:193-216.

Traxler, A. E., and William C. McCall: “Some Data on the Kuder
Preference Record,” Educational and Psychological Measurement (1941)
1:253-268.

Triggs, Frances Oralind: “A Study of the Relation of Kuder Preference
Record Scores to Various Other Measures,” Educational and Psychological
Measurement (1943) 3:341-354.

Triggs, Frances Oralind: “A Further Comparison of Interest Measurement by the
Kuder Preference Record and the Strong Vocational Interest Blank,”
Journal of Educational Research (1944) 37:538-544; also 38:193-200.

Wittenborn, J. R., Frances Oralind Triggs, and Daniel D. Feder: “A
Comparison of Interest Measurement by the Kuder Preference Record
and the Strong Vocational Interest Blanks for Men and Women,”
Educational and Psychological Measurement (1943) 3:239-257.


тта» 
“Ап pee 


ented 
“ weight¢ 1 


riso? 


CHAPTER 17 
Measurement of Attitudes 


Attitudes and interests determine pretty largely the direction of
behavior. Even more than knowledge, attitudes affect action. In the
realm of alcoholic consumption, persons learn all about the evil effects
of alcohol and then drink large quantities of it. If, however, an emotionally
toned attitude is built up against it or in favor of it, action follows
much more certainly. In a great many areas of life is this true. In the
fields of government, economics, labor relations, taxation for schools,
militarism, internationalism, race relations, social relations, and in
many other relations, attitudes play a dominant part in determining
action. If, then, attitudes of adults are so important, why should they
not be of the greatest importance in the schools? The answer is, of
course, that they are and that definite evidence of their development
should be made available.

Measurement, if well developed, could help in providing attested
evidence of the presence of desired amounts of an attitude if the attitude
had already been carefully described as one of the outcomes of instruction.
Unfortunately agreed-upon lists of attitudes desirable for attainment
in school have not been made, and as a result, development of
measuring scales and instruments directly useful in the school situation
has been delayed. Another cause for the confusion in this area has been
the variety of definitions of attitudes developed by competent psychologists.
In one case psychologists define an attitude in rather general
terms as “a more or less emotionalized tendency organized through experience
to react positively or negatively toward a psychological object”
(Remmers and Gage). Here all attitudes involve a reaction for or against
a psychological object. A psychological object is anything which arouses
reactions in individuals. One
кечеш t be а latent tendency such as the one to 


can readi ee 
be Sp Sib Nt nen aid those in distress, but it also a 
h бап а belief in some movement—for example, that js um 
О, Sing or 4 [оп taken in regard to the democratic А > 
спе other f a posit t definition must not be uia e : ust be 
Banized e of this ies ttitudes as We genera у stu y their 
ас isition rough ехрепе M and organized through experience, 
are certainly 16 aT 




А vard 

White children in the South are not born with attitudes toward
Negroes but gather them from their personal experience.

Let us consider one or two other definitions of attitudes. “An attitude is
a set or disposition to act toward an object according to its characteristics
so far as we are acquainted with them” (Woodworth). In this
definition “set” or “disposition” substitutes for Remmers' “tendency.”
It, too, emphasizes environment in the phrase “so far as we
are acquainted with them.” “Object” would also be psychological
object. A second definition also commands our attention: “An enduring
acquired predisposition to react in a characteristic way, usually favorably,
or unfavorably, toward a given type of person, object, institution,
or ideal” (Dashiell). Notice the emphasis here on the word “enduring.”
Unless the experience were enduring it would hardly be called an attitude.
Otherwise we would think of it as a mood or a temporary emotion.
“Predisposition” here corresponds to “tendency” in the Remmers and
Gage definition and to “set” or “disposition” in Woodworth's. Dashiell's
“favorably or unfavorably” corresponds to “positively or negatively”
of Remmers and Gage.
Out of these definitions and their discussion come some of the leading
characteristics of an attitude:
1. An attitude is essentially a set or disposition which is also described
as a tendency.
2. There is always a feeling tone to act favorably or unfavorably,
positively or negatively toward an object.
3. An attitude is the result of experience.
4. The set or disposition is directed toward some psychological object:
a person, a situation, an institution, a race, or an ideal.


gjet 


Attitudes are learned much 
attitude is acquired th iy 8 
wishing to be manly tak of tobacco. He is made death А for 
and as a result forms а ч 


the? e 
9 be well formed. Now and ubt! 




ways that it is difficult to describe. Note the attitude of a child toward
labor unions if he has been reared in a home of a manager of a large
business. Again, the attitude formed may simply be produced by a
process of integration, as when a student who has failed one foreign
language dislikes all of them. The case of a boy comes to mind who
learned to dislike teachers in general because his music teacher lost her
temper to such an extent that she slapped him in the face. This last
example had rather ludicrous repercussions because the lad in question
organized his comrades to sing loudly and lustily off key whenever his
teacher wanted them to sing especially well. An 8-year-old white boy
living on a farm admired greatly a Negro carpenter who used to come
over to the home place to build a shed, repair a roof, or mend whatever
was broken. The boy used to assist the carpenter and enjoyed thoroughly
the days when the carpenter came. He even called the man “Mr.
Savage.” His elders, on hearing the boy say “Mr. Savage,” said, “You
mustn't call him ‘Mister.’ He is a nigger." This was said with such 
emphasis that the lad knew his mistake must not be repeated. These
learning processes are worthy of consideration because they apply both
when attitudes are to be learned and when they are to be changed.
ATTITUDES WITH WHICH MEASUREMENT IS CONCERNED

Since attitudes may be developed toward almost any object, individual,
institution, or race it would be manifestly impossible to develop
measuring scales for all of them. The reasonable method seems, then,
to select some of the most far-reaching ones for both instruction and
measurement. There is some danger here of making the categories so
broad that their specific applications are blurred. One investigator¹
attempted to find what objects an unselected population regarded
as socially significant. He found 238 objects which two psychologists
classified into eight categories:
1. Personality
2. Education
3. Economic activities
4. Family
5. Government
6. Social problems
7. Recreation and exercise
8. Religion
It is clear that such broad areas must be broken down into smaller more
clearly defined units before they could possibly be of much value for the
educative process.

1 Horn, “Socially Significant Attitude Objects,” Studies in Higher
Education, Purdue University (1936), pp. 117-126.




A second attempt to list some of the educable attitudes holds out
more promise of success.


ments for measuring attitudes. 


C Md dis- 
y will give concreteness to the 


er families should be put in th etter 
zm it is 
for the teacher to deci оогег families, and (3) whether it i i 
cide wh ied i tud 
to plan their work themselve ^s to be studied in class or for the § 


objective an individual ng how far along the roa * ду 

a c 
defined the measurement spen til the attitude o M at 
stage. attitudes must remain їп the exper 




extant scal 
es those scal hi 
school. T es w ich may be of value t 
ypes of scales, inventories, and other d ima a of the 
ill now be 


Presented and evaluated. 


In m 1 wW 
eas i i Vi 
urng an attitude, the ideal situation ould be to h 
ave a 


series of u А 

Е nambiguous st 

ranging f g statements placed at equal interval: 

men: ied li onm Hee to absolute disapproval b nr 

\ chosen because it ех d ital ‘i 
a gie xpressed clearly and certa. 

Bis ies on the scale. A person wishing to discover his ach at ra 

the nate c heck the items or statements with which he agreed oe 

ing hi onal points and divide this sum by their number, th А d 
"This position on the scale Pp um 

This ideal scale of equal units has not been attained. The nearest approach
is the Thurstone scales constructed upon the principle of equal-appearing
units. The units are equal because they appear equal to the
persons who sorted the statements into defined piles. In
constructing the scale on the attitude toward the church (Thurstone
and Chave, 1929) 130 statements were collected which reflected varying
degrees of friendliness or unfriendliness toward the church. These statements
were sorted by 300 sorters. Eleven master slips,
designated A to K, were placed upon a table upon which the statements
were placed by the sorters. The positions of three of the master letters
Pressed e defined. In ; € tatements which ex- 
jr ighest appreciation n Pile K, those statements 
only re nen the stronges the church; and in Pile F, 
га] expressions. Thi eries were left 
ts from being 


of the church; i 
t depreciation of 
e remaining letters in the sı 


те used to prevent statemen 
ch: “І have seen 


und 
ps Certain criteria we 
o Yun. Let us take one statement abou 
е in the church." If this statement had been placed at F, G, H, 
with the largest number at H, 


let oy by a substantial number of sorters 
been EN ] it would not have been accepte . It would have 
too ambiguous. Thurstone checked this matter of ambiguity by subtracting
the 25th percentile from the 75th percentile. A much better
situation would arise did the great majority of sorters place the item
at H with a very small number placing it at G or K. As a matter of fact
this item was not at all ambiguous; it was placed at 9.9,
just about at H. In this manner a series of 45 statements was drawn up
ranging from A to K or from 1 to 11 and expressing different degrees of
belief in the church from extreme belief to extreme disbelief, the statements
being placed at equal-appearing intervals.

In like manner was constructed the scale of Communism. This scale
is comparatively easy to use. One simply checks the items in which he
believes. These items, although arranged irregularly on this page, may
be thought of as arranged in a series according to their scale values. The
ATTITUDE TOWARD COMMUNISM, SCALE No. 6, Form A¹
(Prepared by L. L. Thurstone)

Put a check mark (✓) if you agree with the statement.
Put a cross (×) if you disagree with the statement.

1. Both the evils and the benefits of communism are greatly exaggerated.
3. The whole world must be converted to communism.
5. Communism is a much more radical change than we should undertake.
7. Give Russia another twenty years or so and you'll see that communism can be
made to work.
9. Communism should be established by force if necessary.
11. I am not worrying, for I don't think there's the slightest chance that
communism will be adopted here.
13. Communism is the solution to our present economic problems.
15. The ideals of communism are worth working for.
17. The whole communistic scheme is unsound.
19. We should not reject communism until it has been given a longer trial.

median or average score of the items checked is then used as the best
representative of the subject's position. One may also use the range of
scores as another measure and the one statement which most nearly represents
his position as the third measure.
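
The construction step described above can be made concrete with a short sketch. It assumes, as is common for equal-appearing-interval scales but not spelled out in the passage, that the scale value of a statement is the median of the pile numbers (1 for A through 11 for K) assigned by the sorters. The ambiguity index is the 75th percentile minus the 25th percentile, as the text states. The function names, the interpolation rule for percentiles, and the sorter data are illustrative assumptions, not Thurstone's own figures.

```python
from statistics import median


def percentile(values, p):
    """Return the p-th percentile (0-100) using linear interpolation.

    The interpolation rule is an assumption; the text does not say how
    the percentiles were computed.
    """
    ordered = sorted(values)
    if not ordered:
        raise ValueError("at least one placement is required")
    k = (len(ordered) - 1) * p / 100
    lower = int(k)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (k - lower)


def scale_value(placements):
    """Scale value of a statement: the median pile number given by the sorters."""
    return median(placements)


def ambiguity(placements):
    """Ambiguity index: 75th percentile minus 25th percentile of the placements."""
    return percentile(placements, 75) - percentile(placements, 25)


# Hypothetical pile numbers (A = 1 ... K = 11) from eight sorters per statement.
agreed_upon = [9, 10, 10, 10, 10, 10, 10, 11]   # sorters nearly unanimous
spread_out = [6, 7, 8, 8, 9, 10, 11, 11]        # sorters disagree widely

print(scale_value(agreed_upon), ambiguity(agreed_upon))
print(scale_value(spread_out), ambiguity(spread_out))
```

A statement whose placements cluster in one pile yields a small index and would be kept; one scattered over many piles yields a large index and would be discarded as ambiguous.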

There are many more scales constructed by Thurstone or
under his leadership which are of great interest to school people. There
are 17 scales of interest to high school teachers:

1. War (D. D. Droba)
2. The Negro (E. D. Hinckley)
3. The law (D. Katz)
4. The Germans (R. C. Peterson)
5. The Constitution of the United States (A. C. Rosander and L. L. Thurstone)
6. Prohibition (H. H. Smith and L. L. Thurstone)
7. Communism (L. L. Thurstone)
8. M
9. Freedom of speech
10. Honesty in p
11.
12. Unions (L. L. Thurstone)
13. The treatment of criminals (C. K. A. Wang and L. L. Thurstone)
14. The movies (L. L. Thurstone)
15. German war guilt (L. L. Thurstone)
16. Divorce (L. L. Thurstone)
17. The Chinese (R. C. Peterson)

1 By permission of University of Chicago Press.

No one doubts that the attitude scales are good or that they are as
rigorously constructed as any known at the present time. One weakness
appears. The registering of attitudes is desired toward hundreds of
psychological objects. To construct single scales for each such object
would require more work than can be afforded. Is it not possible to
develop a sort of general scale which could be used toward several
objects under certain conditions?

Let us first consider the Bogardus Scale of Social Distance, which is
in the form of a rating scale. It indicates the degree of closeness to which
an individual is willing to admit members of another race. The scale is
as follows: (1) to close kinship by marriage, (2) to my club as personal
chums, (3) to my street as neighbors, (4) to employment in my occupation
in my country, (5) to citizenship in my country, (6) as visitors only
in my country, (7) would exclude from my country. These headings may
appear at the top of a page and under each heading there may be
written the appropriate number, such as:

Canadians 1 2 3 4 5 6 7
Germans   1 2 3 4 5 6 7

The attitude toward each nationality may be expressed by simply
encircling one number. While this is only a rough rating scale, it does
show the general attitude toward a race with some degree of consistency.
The scale may also be used to express attitudes toward various religious
denominations, liberals, agnostics, Socialists, or Communists. It has
the advantage of ease of administration and of showing the general
attitude of the rater. It lacks the precision of the Thurstone techniques
just now described or even of the master scales whose consideration is
now entered upon.

A second attempt to provide a more general scale appears in the
Remmers Master Attitude Scales¹ which were developed according to
the equal-appearing units of Thurstone. There is no difference in the
principle of construction. The difference lies pretty largely in the
generality of the statements. To be satisfactory a scale for any nationality
would have to be stated very broadly so as to include Russian and
German, French and Yugoslavs, etc.

Suppose we consider two of these scales which might be used for the
same purpose. Let the problem be the expression of attitude toward the
Negro race. We might use Hinckley's Attitude toward the Negro,
Form A, constructed after the principles of Thurstone, or we might
use Grice's Generalized Scale Designed to Measure Attitudes toward
Defined Groups constructed according to the principles of Remmers.

1 Remmers, H. H., and N. L. Gage, Educational Measurement and Evaluation.
New York: Harper & Brothers, 1943.




The following are 7 of the 16 statements from Hinckley's Attitude
toward the Negro.


ere 
" " one of m 
) 1. The difference between the black and white races is not 
( : 5 
e, but of kind. 

3 peus Should hold an office of trust, honor or profit. Е 
C» 8. Inherently, the Negro and the white man are equal. Jers dooms the 
l ; 9. The inability of the Negroes to develop outstanding leade 4 
( 2 to a low place in society, Реч the 
After you have educated the Negro to the level of the v 
will still be an impassable gulf between them. 


nest 
А : сотто! 
( ) 13. The Negro is by no means fit for social equality with the 
white people. 


(rats 


( ) 15. It is possible for the white and Negro races to be brothers in Christ
without becoming brothers-in-law.

The following are the odd-numbered items from Grice's Generalized
Scale Designed to Measure Attitudes toward Defined Groups
(scale value in parenthesis):
1. Show a high rate of efficiency in anything they attempt (10.9)
3. Are mentally strong (10.0)
5. Are very patriotic (9.8)
7. Are noted for their industry (9.3)
9. Are a tactful group of people (9.0)
11. I would be willing to trust these people (8.8)
13. Command the respect of any group (8.5)
15. Are of a self-sacrificing nature (8.2)
17. Should be permitted to enter the country as immigrants (8.0)
19. Are a God-fearing group (7.7)
21. Are highly emotional (6.0)
23. Are superstitious (4.6)
25. Are unimaginative (4.2)
27. So far as I am concerned this group can stay in their native country (3.2)
29. Are frivolous (3.0)
31. I am not in sympathy with these people (2.5)
33. Are tactless (2.3)
35. Are despised by the better groups (1.9)
37. Belong to a low social level (1.6)
39. Should not be
41. Should be deported from this country (1.2)
43. te force (.9)
45. Are our worst citizens (.7)


ge бав simply checks the statements 
ia ale va ue from another sheet, and 
scale values. In the Grice scale the statements are arranged from
extremely favorable to extremely unfavorable. The scale values are
immediately before the user. The median score of the items checked is
used to make the scoring very simple and rapid. The general scale does
lose something of the concreteness of the particular scale. The general
scale, moreover, has items which simply are not applicable in some
cases such as No. 17 which refers to immigration, which is not a problem
in the case of Negroes.
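
As a rough illustration of the scoring just described, the sketch below takes the median of the scale values of the checked Grice items. Only items whose number and scale value are both legible in the list above are included. The function name and the option of dropping an item judged not applicable (such as No. 17) are assumptions made for the example, not part of the published scoring directions.

```python
from statistics import median

# Scale values copied from the Grice items listed above; only items whose
# number and value are both legible in that list are included here.
GRICE_SCALE_VALUES = {
    1: 10.9, 7: 9.3, 9: 9.0, 11: 8.8, 13: 8.5,
    15: 8.2, 17: 8.0, 19: 7.7, 21: 6.0, 25: 4.2,
}


def attitude_score(checked_items, scale_values, not_applicable=()):
    """Median scale value of the checked items.

    Items listed in not_applicable are ignored; this handling is an
    assumption for illustration only.
    """
    kept = [scale_values[i] for i in checked_items if i not in not_applicable]
    if not kept:
        raise ValueError("no applicable items were checked")
    return median(kept)


checked = [7, 11, 13, 17]  # hypothetical respondent
print(attitude_score(checked, GRICE_SCALE_VALUES))
print(attitude_score(checked, GRICE_SCALE_VALUES, not_applicable={17}))
```

Because the scale values are printed beside the statements, a scorer can read off the median directly, which is the economy the text attributes to the general scale.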

After all the proof of the pudding is in the eating. If these scales
furnish instruments which register faithfully a subject's attitude
toward a race they are valuable whatever deadwood they contain.
Grice applied the general scale and two of Thurstone's particular scales to
the problem of obtaining attitude toward the Negroes and toward a
second group. Thurstone's scale reliabilities varied around .87; that of the
general scale was .84. When the records from the general scale were
correlated with the records from the particular scale the coefficients ran
up to .75. Thus the evidence is in favor of using either test to
express an attitude. If this is true in general, there is a great gain in
using the general test because it can be used in a great many
situations. Consider Miller's scale on which one could express on one
scale his attitude toward any of the 25,000 vocations listed in the
United States. Remmers and his students have constructed 11 scales,
known as Master Attitude Scales, by means of which a subject may
measure his attitude toward the following:
. Any disciplinary procedure (V. R. Clause) 
. Any elementary teacher (M. Amatara) 
. Any home making activity (B. K- Vogel) 


oe unfavorable m 
elieved in, obtains their sca 


‚ Any play (M. Dimmit) 
A (H. W. E xcti 

Fa ad goci 1 action (7. ^7 omas 

ds ipe d group (Н. Н. Grice) 


1 
2 
3 
4 
5. Any practice 
б, 
7 
8. Any racial or nationa 
9 Any school subject (Е. В 

ion (I. В. Kelly) 
Another attempt at measuring attitudes or beliefs is the Test of
Beliefs on Social Issues.¹ These are most interesting because they grew
out of the school experiences of high school students and represent

1 According to the procedure described in Smith, Eugene R., Ralph W. Tyler,
et al., Appraising and Recording Student Progress, pp. 209-229. New York: Harper
& Brothers, 1942. By permission of Harper & Brothers.




h 
toan extent their own experiences. It is reported: “ In several — 
students and parents as well as teachers participated in this ie ае ó 
—samples of student writing were analyzed, as were their x for the 
‘research’ topics and free reading.” Forms were developed both for the
senior high school and for the junior high school level. The construction
of this test differed from either the Thurstone or Remmers scales.
The instrument consists of 200 statements “classified under the following
areas of issues: democracy, economic relations, labor and unemployment,
race, nationalism, and militarism.”


Students respond to each issue by agreement, disagreement, or uncertainty.
“The statements are arranged in random order and are presented
to the students in two sections given at different times. For each statement
in the first section there is a statement in the second section representing
the opposite point of view.” Two illustrations of the way
opposite items are phrased and of how they appear in different sections
will be presented. The first pair deals with labor and unemployment:
4.21 14. 


в erio 
Most workers who are unable to provide for themselves during : Fath 
of unemployment have been too shiftless to save. Agree, Dis* 
Uncertain. 


104. The w 


> 
ww 


ment. Agree, Disagree, 


The second pair deals with nationalism: 


ign 
А reig" 
endis ч шы Ought to protect American business interests i agree 
untries even if it involves usin, Agree М 
о һы , 
Uncertain. § our army and navy. Ag 
4.31 


-— 

i i pus! 
ould not risk a war to protect American 
countries, Agree, Disagree, Uncertain. 
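
The pairing of opposite statements invites a simple consistency check, sketched below. The passage does not say how the original test scored consistency, so the rule used here is an assumption: a pair counts as consistent when the student agrees with one statement and disagrees with its opposite, and pairs containing an "Uncertain" response are left out. Item numbers 14 and 104 come from the illustration above; the other item numbers and all responses are hypothetical.

```python
def pair_consistency(responses, opposite_pairs):
    """Fraction of opposite-statement pairs answered consistently.

    responses maps item number to "A" (agree), "D" (disagree), or
    "U" (uncertain). Returns None when no pair can be scored.
    """
    scored = 0
    consistent = 0
    for first, second in opposite_pairs:
        a, b = responses[first], responses[second]
        if "U" in (a, b):
            continue  # an uncertain answer gives no evidence either way
        scored += 1
        if a != b:  # agreeing with one and disagreeing with its opposite
            consistent += 1
    return consistent / scored if scored else None


# Pair (14, 104) is taken from the text; the remaining pairs are made up.
pairs = [(14, 104), (21, 121), (35, 135)]
answers = {14: "D", 104: "A", 21: "A", 121: "A", 35: "U", 135: "D"}
print(pair_consistency(answers, pairs))
```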


5 
иде 
Кефе їз ж checked by measuring the cia the 
< test against the Opinion . regardi 
ttitudes, Further, р 5 of the teachers gar д ай 


ў ;u 

lent of reliability was 95 ^ hav? 

- Unfortunately these sca ; 
e 

R tu 

Pt to discover the ri th of att! ре 

tp", 

toward the Negro was made through th rise and grow Panoge”, 

« 8h the use of pictures. 
1 Hartley, Eugene L., “The Development of Attitude toward the Negro,”
Archives of Psychology (1936) No. 194, p. 47. Items by




three tests were constructed. The first two tests might be answered
directly from the pictures in Fig. 36. These pictures were judged by
competent persons who had had much experience with both Negroes
and whites to be both pleasing and typical of the races studied and

By permission 
New York.) 


Lure attitudes toward Negroes. 
d with Harper & Brothers, 
yhite. Jt will be noticed also that some 
lt the white faces while others are 
d by boys.) The posi- 
s a chance one. The 
dergarten through 
re, which in the 


Fig ups 
of p. 36. Photographs used to meas 


"gene L, Hartley and by arrange 


Unean: 
of iis aiy either Negro thañ 
a egro faces are lighter а j 

toker, (All pictures were 0 boys and жыш i 

е 9f the white picture among the three * T ti 
test el children from the 57 

Brade 8 s administered oe directl from this pictu 

° were to be answere 2 


is 
Binal was about 10 by 1072 




In the first test the subject was asked to “pick out the one you like
best, next best, next best,” etc., until all pictures were ranked. The
scoring was directly dependent upon the ranks assigned to the four white
faces. For example, if they were ranked 1, 2, 3, 4, the score was 10, the
lowest possible. Chance scoring of the ranks of white faces would give 26.
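
The scoring of the first test can be checked with a small sketch. It assumes that twelve pictures were ranked, four of them white; the passage gives only the number of white faces, but twelve is the total for which random ranking yields the stated chance score of 26, since the average rank is 6.5 and four times 6.5 is 26. The picture labels and function names are hypothetical.

```python
from fractions import Fraction


def white_rank_score(ranking, white_pictures):
    """Sum of the ranks (1 = liked best) given to the white faces."""
    rank_of = {picture: rank for rank, picture in enumerate(ranking, start=1)}
    return sum(rank_of[picture] for picture in white_pictures)


def chance_score(n_pictures, n_white):
    """Expected sum of ranks if the white faces were ranked at random."""
    return Fraction(n_white * (n_pictures + 1), 2)


# Assumed layout: 12 pictures labelled P1..P12, of which P1..P4 are white faces.
pictures = [f"P{i}" for i in range(1, 13)]
white = pictures[:4]

print(white_rank_score(pictures, white))        # white faces ranked 1, 2, 3, 4
print(white_rank_score(pictures[::-1], white))  # white faces ranked last
print(chance_score(12, 4))
```

Under these assumptions the lowest possible score is 10, the highest is 42, and the chance value is 26, which agrees with the figures given in the text.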
In the second test, companions were selected from the pictures for a
variety of imagined situations, as follows:
1. Show me all those that you want to sit next to you in a street car.
2. Show me all those that you want to be in your class at school.
3. Show me all those that you would play ball with.
4. Show me all those that you want to come to your party.
5. Show me all those that you want to be in your gang.
6 
7 
8 


h. 
Е lunc 
you want to go home with you {0 


А г ovies- 
you want to sit next to in the m 


for both forms of the test. ; 

In scoring this test the relative frequency was computed with which ...

... Social Situations test. In its final form ... in which there were
children engaged in ... activities ... (1) all white children, and (2)
white children with one or more Negro children. The activities involved
may be illustrated by eating in the ice-cream parlor ... The question
asked the children was ... to join in with them and do what they are
doing ...

The validity of these tests was deduced more from their internal
consistency and interrelation than with any measurement against ...
measures of race attitude. Of the tests themselves, Test I, the ...


MEASUREMENT OF ATTITUDES 459 


well as when the races were separated. Only children of Communists 
showed no prejudice toward the Negro. 

E. C. Hunter's Test of Social Attitudes¹ contains opportunities for
expressing one's attitude toward a single race, war, economics and
labor, social life and convention, government, and religion. Before each
statement there are five numbers (2, 1, 0, -1, -2) by which a subject
can express five degrees of conviction, from 2 if he is strongly
convinced the statement is true, through 0 if he is undecided, to -2 if
he is strongly convinced the statement is false. Norms are available at
the college level only.

The question of race attitudes among children has been investigated.²
There were no neatly scaled questions running from extreme favor to
extreme disfavor but rather a set of natural situations about which
questions were asked. Two samples follow:

I. A Jewish family in Chicago was planning to buy a house and to move on
a street where only native-born American families lived. The people who
already lived on this street did not like to have this family move
there, and went to the proprietor of the house trying to persuade him
not to sell it to the Jews.

1. Was it all right for the Jewish family to desire to move on this
street?
2. Was it all right for the people who already lived on the street to
try to keep them from doing so?
3. Ought a city to be divided into sections or quarters and each racial
group to live in its own quarter?
4. If the Jews had moved on the street, ought the other families to be
friendly to them?
5. Would Jews usually make as good neighbors as other people?

II. A football team from Gordon College, Indiana, was to meet the team
from Corliss College, Alabama. A Negro played on the Gordon College
team. The team from the South sent word it would not play if the Negro
was in the line-up. So the Gordon coach kept him out of the game.

1. Did the coach do right in keeping the Negro out?
2. Would it have been better for Gordon College to cancel the game?
3. Would it make any difference if the game had been played in the South
instead of the North?
4. If you were a college athlete, would you ... Negroes?
5. If you were a college athlete, would you just as soon play against a
team that had 2 or 3 Negro players as one that did not?

Questions could be answered in five ways, with scores as follows when
“Yes, certainly” was considered the desirable answer:

1 Hunter, E. C., A Test of Social Attitudes. Psychological Corporation,
New York, 1938.
2 Minard, Ralph D., Race Attitudes of Iowa Children, Studies in
Character, Vol. 4, No. 2, University of Iowa.
s follows when 


controversial issue will modify attitudes substantially. Without special
teaching, changes in attitudes are slight.

So far this chapter has emphasized the importance of attitudes, the need
of defining and describing them so that their presence can be detected,
and the need of constructing scales on which changes of attitudes can be
reflected. It has tried to make clear that attitudes are of tremendous
importance in their effect on personality. So important are they that
their development cannot be left to chance. Some illustrations have been
offered of how attitudes may be modified. It is clear that they must be
attacked directly and specifically, not indirectly and generally.

It is therefore recommended:

1. That we begin work immediately on deciding upon a small list of
outstanding attitudes; those which most people would count important.
2. That we so define these attitudes and describe them that their
presence can be detected in the behavior of young people.
3. That we then proceed to construct scales on which these attitudes may
be reflected and apply them in such a manner that we can be very certain
that our procedures are producing in attitudes the changes we want.


SUMMARY

Since no over-all list of measurable attitudes had been made, lists of
... their applicability to ... situations or so specific that the mere
construction of scales for ... would be prohibitive. The Thurstone ...
construction ... was clear that such changes require specific, not
general, instruction. Evidence was offered that scales are useful
instruments for registering the amount of change.


QUESTIONS AND EXERCISES

1. What is an attitude? Name four or five important attitudes.
2. How are they learned? Why are they so important?
3. What lists of attitudes are there which indicate the outcomes of
education?
4. How would you go about testing such attitudes?
5. What would an ideal attitude scale be? Compare with Thurstone's. On
what principle are Thurstone's statements scaled?
6. What is the Bogardus Scale of Social Distance? Evaluate it.
7. How do the scales of Remmers differ from those of Thurstone? Evaluate
(a) the former's construction and arrangement, and (b) its principle of
construction.
8. What are chief characteristics of Smith and Tyler's Test on Beliefs
and Social Issues?
9. How are Hartley's pictures used to show attitude toward race? What
did he discover?
10. Describe Minard's study of race attitudes. What was discovered about
the change with age of attitudes toward race?
11. What other instruments are there for measuring attitudes?
12. Describe an experiment for the study of the change of attitude
toward an important institution or idea. Name the scale you would use
and describe precisely the learning procedures.


BIBLIOGRAPHY

Books

BRIGGS, T. H.: “The Measurement of Attitudes,” in T. H. Briggs, et al.,
The Emotionalized Attitudes. New York: Bureau of Publications, Teachers
College, Columbia University, 1940.

BRUNER, HERBERT B., and ARTHUR V. LINDEN: A Tentative Check List for
Determining the Positions Held by Students on Forty Crucial World
Problems. New York: Bureau of Publications, Teachers College, Columbia
University, 1935.

LENTZ, THEODORE F.: C-R Opinionaire (Conservatism-Radicalism). St.
Louis: Character Research Institute, Washington University, 19...

LEWERENZ, ALFRED S.: ... Orientation T... for High School and ... Los
Angeles: California School Book Depository, 1935.

MURPHY, GARDNER, and RENSIS LIKERT: Public Opinion and the Individual.
New York: Harper & Brothers, 1938.

NEWCOMB, T. M.: “Social Attitudes and Their Measurement,” in Gardner
Murphy, Lois B. Murphy, and T. M. Newcomb, Experimental Social
Psychology, rev. ed. New York: Harper & Brothers, 1937.

PETERSON, RUTH C., and L. L. THURSTONE: Motion Pictures and the Social
Attitudes of Children. New York: The ...

REMMERS, H. H., and N. L. GAGE: Educational Measurement and Evaluation
... Related Aspects, ...

SMITH, EUGENE R., RALPH W. TYLER, et al.: Appraising and Recording
Student Progress, ... Evaluation of Social Attitudes ... 203-244. New
York: Harper & Brothers, 1942.

SMITH, F. T.: An Experiment in Modifying Attitudes towards the Negro,
Contributions to Education, No. 887. New York: Bureau of Publications,
Teachers College, Columbia University, 1943.

THURSTONE, L. L., and E. J. CHAVE: The Measurement of Attitude. Chicago:
University of Chicago Press, 1929.

WRIGHTSTONE, J. W.: Wrightstone Scale of Civic Beliefs. Yonkers, N.Y.:
World Book Company, 1938.

Articles

BOGARDUS, E. S.: “A Social Distance Scale,” Sociology and Social
Research (1933) 17:265-271.

COREY, STEPHEN M.: “Professed Attitudes and Actual Behavior,” Journal of
Educational Psychology (1937) 28:271-280.

GARDNER, IVA COX: “The Effect of a Group of Social Stimuli upon
Attitudes,” Journal of Educational Psychology (1935) 26:471-478.

HINCKLEY, E. D.: “The Influence of Individual Opinion on Construction of
an Attitude Scale,” Journal of Social Psychology (1932) 3:283-296.

... Purdue University (1936) 37:117-126.

HOROWITZ, E. L.: “The Development of Attitude toward the Negro,”
Archives of Psychology (1936) Vol. 28, No. 194.

KELLEY, IDA B.: “The Construction and Validation of a Scale to Measure
Attitude toward Any Institution,” Studies in Attitudes, Studies in
Higher Education, XXVI, Bulletin of Purdue University (1934) 35:18-36.

LIKERT, R.: “A Technique for the Measurement of Attitudes,” Archives of
Psychology (1932) Vol. 22, No. 140.

MINARD, RALPH D.: “Race Attitudes of Iowa Children,” Studies in
Character, Vol. 4, No. 2, University of Iowa.

PETERS, F., and M. ROSANNA: “Children's Attitudes towards Law as
Influenced by Pupil Self Government,” Studies in Attitude, Series II,
Studies in Higher Education, XXXI, Bulletin of Purdue University (1936)
37:15-2...

REMMERS, H. H.: “Generalized Attitude Scales: Studies in
Social-Psychological Measurements,” Studies in Attitudes: A Contribution
to Social-Psychological Research Methods, Studies in Higher Education,
XXVI, Bulletin of Purdue University (1934) 35:7...


CHAPTER 18

Measurement of Personality Traits

The term “personality” as used in this connection is a much narrower
one than when used ordinarily. In its broadest sense personality may be
thought of “as the total quality of an individual's behavior.”¹ In this
sense every instrument that has been studied in this text would be
included as well as those contained in this chapter. But after
instruments of measurement of achievement, intelligence, interests, and
special capacities had been developed there arose a felt need for
getting some sort of measurement of those other personality
characteristics which loom so large both in the individual's adjustment
and in his interaction with others. Personality inventories and tests
came to include those traits and characteristics not included in tests
of intelligence, of achievement, or of special capacities. More
particularly, these inventories came to refer to those aspects of
emotional adjustment which contributed to personality balance and
integration. Many of these traits exhibit their leading characteristics
when individuals' strongest desires are thwarted and they can find no
satisfactory solutions for the resulting problems.

As in other areas of behavior, approach to a better understanding
obtains through both the objective and the subjective techniques.
Objectively, observation of behavior is made either systematically or
fortuitously and then ratings are made of the behavior. Thus the
experimenter takes his place at the front of a classroom and to one side
in order to observe the oral habits, spasms or tics, which appear. He
has in this manner narrowed the field of observation and may repeat the
... so that reliable results can be secured. Both the ratings of ... and
the continued recording of observations concerning a student or pupil
are objective in nature and are subject to errors of interpretation. On
the other hand, if we can gain the cooperation of the subject himself so
that he is willing to answer directly those questions which refer to his
emotional life we can gain a great deal of information about his
adjustment in a short time and with far less expenditure of energy. It
is this subjective approach to the understanding and evaluation of
adjustment which first made its appearance and which has had far more
work done upon it than the objective approach.

1 Woodworth, R. S., Psychology, 1940 edition. New York: Henry Holt and
Company, Inc., p. 137.


SELF INVENTORIES OR QUESTIONNAIRES

It was Professor Woodworth of Columbia University who in 1917 was called
upon to develop a general screening test for our armed forces which
would indicate the presence of the more severe types of mental
maladjustment so that such cases could either be eliminated from the
army program or made sub...

 9. Does your heart ever thump in your ears so that you
    cannot sleep?                                           Yes  No
19. Have you ever had fits of dizziness?                    Yes  No
29. Have you ever lost your memory for a time?              Yes  No
39. Did the teachers in school generally treat you right?   Yes  No
59. Do you ever have a queer feeling as if you were not
    your old self?                                          Yes  No
79. Do you feel like jumping off when you are on high
    places?                                                 Yes  No


1 Items by permission of C. H. Stoelting Co., Chicago.

FUNDAMENTAL DIFFICULTIES WITH SELF-INVENTORIES OR SELF-REPORTS

One of the most immediately perceived difficulties lies in ... Thus, ...
9 above, which asks about ... have one thing in mind which the subject
misinterprets. In the second place, a subject may not know the answer
... forgotten or ... presence. Thus, ... In the third place, it is
difficult to get the subject honestly to divulge the ... experiences of
his daily life. He may not want anyone to know that he has fits of
dizziness, has lost his memory, ... In the fourth place, it is very
difficult to discover personality traits that are really disparate
segments of personality which do not overlap too much with other traits.
Some of the dimensions studied are dominance-submission,
introversion-extroversion, self-confidence, self-sufficiency, etc. The
burning question here is whether or not these dimensions are realities
in the life of individuals or merely constructs in the mind of the
tester. Are they independent of each other, or are they so closely
related that one of them correlates highly with the other?

It is for these reasons that scores from psychoneurotic inventories are
not to be interpreted as are simple reading scores or even those from
intelligence tests. All scores must be regarded as tentative and
experimental, as aids in interpreting the whole individual. For example,
no one would be justified in interpreting a high score on the Bernreuter
Neurotic score as indicating a definitely neurotic condition. One could
certainly interpret it as indicating that such a case needs further
study or as confirmatory evidence of a condition which had already been
suspected because of the activities of the subject. Such a score then
must be interpreted in the light of the whole individual and not as a
discrete entity. In general, three limitations of these inventories must
ever be kept in mind. In the first place, in no case has the validity of
an inventory been adequately determined. In the second place, the
reliabilities of separate traits measured by the inventories are rarely
high enough for individual diagnosis. One must remember that an
individual's diagnosis based on a reliability as high as .90 is subject
to considerable chances of error, the efficiency being 56 per cent. When
profiles are constructed based on separate traits whose reliabilities
are around .75, the consequences are almost ludicrous, for the
efficiency of prediction is only 34 per cent. In the third place, the
dimensions are not independent. A score on one dimension of personality
may be made up partly of a score on a previous dimension. In a few
cases, inventories have measured traits that are independent. Such
independence involves rather elaborate statistical treatment called
factor analysis. Even when the factors are computed all the investigator
knows is that there is a Factor I, a Factor II, and a Factor III which
are uncorrelated, i.e., are independent. The name which he gives to each
factor is dependent upon its relations to various measures and his own
psychological insight. In the first case, we have well-known
psychological traits whose independence is problematical; in the second,
we have definitely independent factors whose names are problematical. To
sum up, the ... personality ... importance. The techniques of ...
desired stage of validity. The interpretation based on scores from these
instruments must be tentative and contingent. If proper precautions are
exercised such instruments are of very great importance in studying the
personality of children and adults.
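
The two efficiency figures quoted above (56 per cent for a reliability
of .90 and 34 per cent for .75) agree with the index of forecasting
efficiency, E = 1 - sqrt(1 - r^2). The text does not name the formula it
used, so treating this index as its source is an assumption; the function
name below is illustrative only.

```python
import math


def forecasting_efficiency(r):
    """Index of forecasting efficiency, E = 1 - sqrt(1 - r**2).

    Assumed here to be the formula behind the percentages in the text.
    """
    return 1.0 - math.sqrt(1.0 - r * r)


for r in (0.90, 0.75):
    pct = round(100 * forecasting_efficiency(r))
    print(f"r = {r:.2f}: efficiency about {pct} per cent")
```

On this reading even a coefficient of .90 removes only a little more than
half of the error of pure guessing, which is the point of the caution
about individual diagnosis.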


TYPES OF SELF INVENTORIES

The Bernreuter Personality Inventory,¹ ... of the “yes-no-?” type, has
the advantage of yielding six dimensions ... dominance-submission ...
The test was first ... but later it was learned that ... between these
... correlation between ... between neuroticism and self-sufficiency ...
The size of these coefficients suggested to Flanagan² ... his work
indicated that ... factors, Flanagan concluded, could be used ... the
entire four, and of the two, the first ... in six different ways.
Logically the ... dimensions should have b...

Some further facts about this ... consideration of its construction ...
making of this ... self-sufficiency ... dimension thereby derived ...
original test. For example, the ... Thurstone Personality Schedule ...
neuroticism would have a ... from which it was derived.

Samples taken from the Bernreuter Personality Inventory³ and the scoring
on six different keys ...

1 Stanford University Press. Items by permission.
2 Flanagan, J. C., Factor Analysis in the Study of Personality. Stanford
University, Calif.: Stanford University Press, 1935.
3 Items by permission of Stanford University Press.


1. Yes  No  ?  Does it make you uncomfortable to be “different” or
               unconventional?
2. Yes  No  ?  Do you daydream frequently?
3. Yes  No  ?  Do you usually work things out for yourself rather than
               get someone to show you?
4. Yes  No  ?  Have you ever crossed the street to avoid meeting some
               person?
5. Yes  No  ?  Can you stand criticism without feeling hurt?

The scoring on the different keys for the first two items is shown in
the accompanying table. In this manner the 125 items may be scored in
six different ways yielding thereby six dimensions of personality.
Because each dimension is derived from all 125 items, its reliability is
higher than if it were derived from 20 to 25 items as would have been
the case had separate sets of items been responsible for the score in
each dimension. The reliability coefficients for the first four
dimensions range from .89 to .92 in one ... and from .85 to .88 in the
other; while the self-confidence dimension had a reliability of .86 and
that of sociability, .78. “Such correlations would rate students 70% of
the time on a five-step scale and practically all the time with an error
of one step on each scale.”¹

(Entries shown as ... are illegible in this copy.)

                      Self-       Extroversion-  Dominance-  Lack of self-
           Neurotic   sufficiency introversion   submission  confidence    Sociability
1. Yes         2         ...           1             -3           1            -2
   No         -2          4           ...             3          -2             3
   ?           0          1           ...            -1           3            -3
2. Yes         5          1            3             ...          3             2
   No         ...        ...          ...             1          -5            -3
   ?          ...        -2            0              2           0             5
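
The idea of scoring one set of answers on several keys can be sketched
as follows. The weights in the sketch are hypothetical and cover only two
items and three keys; they are not the published Bernreuter weights, and
the names `WEIGHTS`, `KEYS`, and `score_all_keys` are illustrative
assumptions.

```python
# Hypothetical weights: (item number, answer) -> weight on each key.
# These values are for illustration only, not the published key.
WEIGHTS = {
    (1, "yes"): {"neurotic": 2, "dominance": -3, "sociability": -2},
    (1, "no"): {"neurotic": -2, "dominance": 3, "sociability": 3},
    (1, "?"): {"neurotic": 0, "dominance": -1, "sociability": -3},
    (2, "yes"): {"neurotic": 5, "dominance": 1, "sociability": 2},
    (2, "no"): {"neurotic": -4, "dominance": 1, "sociability": -3},
    (2, "?"): {"neurotic": -1, "dominance": 2, "sociability": 5},
}
KEYS = ("neurotic", "dominance", "sociability")


def score_all_keys(responses):
    """Score one answer sheet on every key at once.

    responses maps item number -> "yes", "no", or "?". Each answer adds
    its weight to every key, so all items contribute to all dimensions.
    """
    totals = {key: 0 for key in KEYS}
    for item, answer in responses.items():
        for key, weight in WEIGHTS[(item, answer)].items():
            totals[key] += weight
    return totals


print(score_all_keys({1: "yes", 2: "no"}))
```

Because every answer contributes to every key, the resulting dimensions
share the same items, which is one reason such scores cannot be treated
as independent.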


Since there are no readily ... criteria against which to measure the
results of this inventory the validity is necessarily in doubt.
Bernreuter himself unfortunately presents in his manual coefficients of
correlation between the dimensions secured from the scores of his test
and the original tests from which his was constructed as evidence of
validity. This seems a trifle like begging the question, since his own
personality inventory had in it items from these very tests (i.e., items
from Thurstone's Personality Schedule, ... Laird's introversion-...
Thurstone's neurotic inventory and Bernreuter's dimension of “neurotic”
was ..., but Thurstone's neurotic inventory was the source of many of
Bernreuter's items. One could hardly have expected a different outcome.

1 Flanagan, J. C., “Technical Aspects of Multi-trait Tests,” Journal of
Educational Psychology (1935) 26:641-6...
A more severe test of the validity of the Bernreuter Personality
Inventory was made when it was used to test the emotional differences
between the sane and the insane.¹ Three inventories by Woodworth,
Bernreuter, and Page were found not to be useful for distinguishing the
normal from the insane. Further evidence of evaluation appears in the
contradictions in the results of the uses to which the inventory has
been put. Two investigations found no great assistance from the
inventory in differentiating between problem and nonproblem groups. In
one of these² correlations were made between the inventory scores and
counselors' ratings, with very low coefficients as results. But problem
cases frequently involve moral as well as emotional maladjustments and
moral traits lie outside the boundaries of this inventory. On the other
hand, another investigation⁴ found the inventory extremely valuable in
distinguishing between well-adjusted and maladjusted students in a
situation involving consultation service. It is possible that students
came more nearly giving forthright responses when they knew that the
scores were to be used to keep them adjusted more adequately. Likewise
the Bernreuter Inventory was found to be of considerable value “as an
aid in the diagnosis of psychopathic inferiors.”⁵

It should be pointed out that Bernreuter's Inventory was not constructed
to distinguish between the normal and the psychotic ... but between good
and poor adjustment in otherwise normal ...

In summary, the Bernreuter Personality Inventory has been published for
long enough time to discover something of what it can do. More than 100
studies have been made with its dimensions as leading variables. The
results are not clear. In one case, the self-confidence score was fairly
valid but much less so was the sociability score. The inventory seems to
be of more value in differentiating the normally maladjusted than in
differentiating the psychotic. Criticisms generally leveled at
self-inventories apply to this inventory. The subject has difficulty in
interpreting the items such as: “D... He may also lie about an item ...
On the positive side, this inventory does have two dimensions ...
uncorrelated or independent. If, as Flanagan's study indicates, ... the
variance discovered is due to Factor F1-C, which is ... self-confidence,
then Bernreuter's Inventory might best be ... this trait. Thus the lack
of self-confidence looms up ... of personality.

1 Landis, et al., “Empirical Evaluation of Three Personality Adjustment
Inventories,” Journal of Educational Psychology (1935) 26:321-330.
2 ..., and A. A. ..., “Does the Bernreuter Inventory Contribute to
Counseling?” Educational Research Bulletin (1938) 17:7-9.
3 Speer, G. S., “The Use of the Bernreuter Personality Inventory as an
Aid in the Prediction of Behavior,” Journal of Juvenile Research (1936)
20:65-69.
4 Stogdill, Emily, and Minnie E. Thomas, “The Bernreuter Personality
Inventory as a Measure of Student Adjustment,” Journal of Social
Psychology (1938) 9:299-315.
5 Hathaway, S. R., “The Personality Inventory as an Aid in the Diagnosis
of Psychopathic Inferiors,” Journal of Consulting Psychology (1939)
3:11...

A self-inventory resembling in general outline ... is the Bell
Adjustment Inventory,¹ which has an adult and a student form. This
instrument was constructed from ... items of the Thurstone Personality
Schedule and an additional 188 new items ... by the author. As with
other such tests, each item was ... the items as a whole by discarding
those which did not differentiate between the “upper and the lower 15
per cent of the scores in the distribution for each category.”² In
addition, the criterion of applicability (i.e., the item must be checked
by at least 25 per cent of the ... group) was also applied for retention
of an item as well as one which eliminated the items which were
sometimes misunderstood. From this rather rigorous process 140 items
remained to compose the inventory. Four categories, under which there
are 35 items each, are (1) home adjustment, (2) health adjustment, (3)
social adjustment, and (4) emotional adjustment. The adult form adds
another: occupational adjustment. The author claims that these divisions
are concrete and ... and that the counselor and his counselee understand
these terms. The overlapping among the divisions is not large.
Intercorrelations among the four divisions range from .04 to .53 and
average .35. The reliability of the inventory, shown in the accompanying
table, is as high as one could expect.

Division                    Reliability
Home adjustment                 .89
Health adjustment               .80
Social adjustment               .89
Emotional adjustment            .85
Total score                     .93

... by simply counting ... also be scored with weights ranging from +6
to ... Weighted scores are slightly more reliable (.95 as compared with
.93 for the total score) and ...

1 Stanford University Press.
2 Bell, Hugh M., The Theory and Practice of Personal Counseling, page
25. Stanford University, Calif.: Stanford University Press, 1939.


belonging, (e) freedom from withdrawing tendencies, and (f) freedom
from nervous symptoms.

2. Social adjustment, with its subdivisions of (a) social standards, (b)
social skills, (c) freedom from antisocial tendencies, (d) family
relations, (e) school relations, and (f) community relations.

Up through the inventories suitable for grade 10, these divisions are
printed in plain view upon the inventory which the subject takes. In the
secondary and adult series ... shown in the accompanying table.

                   Intercorrelation   1. Self-     2. Social
Series             of 1 and 2         adjustment   adjustment   Total
Primary A               .66              .893         .813       .922
Elementary B            .66              .888         .867       .933
Intermediate B          .74              .898         .872       .932
Secondary A             .54              .904         .908       .931
Adult Series            .76              .888         .898       .918

... high ... to locate more restricted areas ... From the table also it
is clear that the correlations between the two divisions of the
Inventory vary from .54 (substantial) to .76 (high). Even the main
divisions of this ... from being uncorrelated.

The claims for validity ... large... previously been ... workers. The
... selecting the items for the final form ... four: “(a) The judgments
of teachers and principals regarding the relative validity and
significance of each item. (b) The reactions of pupils expressing the
extent to which they felt confident and willing to give ... (c) A study
of the extent to which pupil ... and teacher appraisals agreed. (d) A
study of the relative significance of items by means of the bi-serial r
technique.”¹ This biserial r technique is a procedure by means of which
each item is correlated with the total to measure its degree of
agreement with the ... as a whole. The manual furnishes only this bare
outline of the selection of items without any statistical confirmation.
There was an attempt to disguise the meaning of the items in the ... so
that the author's intent would not be too apparent; e.g., the ... but
“Are some people so unfair that you try to cheat?” ... such a question
as “Are you mean to people?” but rather “Are people often so bad that
you have to be mean to them?” This low visibility of items, however, is
immediately negated in the ... inventories by clearly printing the names
of the components on the face of the inventory which the child takes.

The directions for administering, instructions for scoring, and norms
are all that could be desired. The norms are percentile scores easily
read from the tables so that the profile consisting of the main
divisions and the twelve subsections may be easily constructed. Were
these total scores certainly valid and the subscores valid and reliable,
and did they ... the one with the other, no more desirable graph could
be constructed than this one based on such important outcomes of
education. While the profiles are useful, they must be received and
considered in the full light of how uncertain their meanings are. Above
all, we must remember that these scores are based on what subjects say
they feel. In ... there is no doubt but that this series of inventories
is as useful for school purposes as any others. The manual uses five
full pages ... of individuals who deviate too far from the normal. It is
questionable whether the discussion of so difficult a problem in such a
small space might not be a dangerous procedure.

1 Manual of Directions, California Test of Personality, Elementary
Series, p. ... By permission from California Test Bureau, Los Angeles,
Calif.
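
The biserial r procedure mentioned above, in which each item is
correlated with the total score, can be sketched as follows. The manual's
own computations are not given in the text, so this is only a sketch
under the usual textbook formula, r_bis = ((M1 - M0) / s) * (pq / y),
where y is the ordinate of the unit normal curve at the point dividing it
into the proportions p and q. The function name and the sample data are
hypothetical.

```python
from statistics import NormalDist, mean, pstdev


def biserial_r(item, totals):
    """Biserial correlation between a two-way item and the total score.

    item:   list of 1 (keyed answer given) or 0 (not given)
    totals: list of total scores for the same subjects
    """
    keyed = [t for i, t in zip(item, totals) if i == 1]
    other = [t for i, t in zip(item, totals) if i == 0]
    if not keyed or not other:
        raise ValueError("item must divide the group into two parts")
    p = len(keyed) / len(item)   # proportion giving the keyed answer
    q = 1.0 - p
    s = pstdev(totals)           # standard deviation of all totals
    if s == 0:
        raise ValueError("total scores do not vary")
    unit_normal = NormalDist()
    y = unit_normal.pdf(unit_normal.inv_cdf(p))  # ordinate at the cut
    return (mean(keyed) - mean(other)) / s * (p * q / y)


# Hypothetical data: eight subjects, one item, and their total scores.
item = [1, 1, 0, 1, 0, 1, 0, 0]
totals = [30, 22, 25, 27, 18, 24, 21, 26]
print(round(biserial_r(item, totals), 2))
```

An item whose keyed answer is given chiefly by subjects with high totals
yields a positive coefficient; reversing the keying of the item reverses
the sign.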

Another inventory suitable for children (grades 4 to 9) is Aspects of
Personality by Pintner, Loftus, Forlano, and Alster.¹ Three dimensions
of personality, practically uncorrelated with each other, are obtained
directly from the scores: (1) ascendance-submission, (2)
extroversion-introversion, and (3) emotionality. The items for this test
were selected from seven inventories which had already been constructed
and also from new items which were developed.² The language of the items
was simplified in order to fit the level of fourth-grade children. ...
children could not understand the ... they themselves were allowed to
reword them. From the inventories and from the items suggested by the
authors about 900 statements were secured. The items were rated by the
authors on the basis ... having an item correlate high with the
dimension under which it falls and ... others, was carried ... A rather
interesting personalization of the items was ... follows:

I don't like to ask questions in class    [S  D]
I like to play rough sports               [S  D]
I feel tired most of the time             [S  D]

The subject was to think whether he was same (S) or different (D) and
cross out the proper letter.

The reliability of the inventory was tested out both by the split-half
and the test-retest ... For the Ascendance-Submission ... coefficients
were .69 and .65; for the Extroversion-Introversion dimension, ...; and
for the Emotionality dimension, ... This would indicate reliability ...
inadequate for individual ... Percentile standards are ... suggestions
as to what to do ... children who score very low in each ... these
dimensions. But the number of cases used in the development of the
standards is not mentioned. The manual itself ... this general caution;
it advises that “no simple group ... diagnose, but it can indicate
children who need careful attention.” However, one must not regard
percentile scores derived from these inventories as anything more than
“a general description of a child's personality” and must not expect it
to be an accurate diagnosis of personality difficulties.

1 World Book Company, Yonkers, N.Y. Items by permission.
2 Woodworth-Matthews's Psychoneurotic Questionnaire; Allport's A-S
Reaction Study; Thurstone's Personality Schedule; Pintner's General
Opinion Test; Bernreuter Personality Inventory; Maller's Character
Sketches; Lecky's Individuality ...


MEASUREMENT OF PERSONALITY TRAITS 477 


As in all other such inventories the validity is weak. It has not been
measured against any sound criteria outside itself. Possibly also the
instructions are a little infantile for children in grades 8 and 9. One has
the distinct impression that it is pitched on a fourth- or fifth-grade level.
There is also no definite suggestion in the manual for careful observations
of behavior as supplementary to the inventory's scores.

The Maller Case Inventory¹ is constructed somewhat differently
from the inventories already described. It is divided into four parts:

1. Controlled association test
2. Adjustment test
3. Honesty test (self-scoring)
4. Ethical judgment test

In the controlled association test, 50 words were selected out of a list
of 200 items. These 200 words were collected from the Kent-Rosanoff,
Jung, and other lists of free-association items. In these original lists a
word is given and the subject responds with the first thing that comes
to mind. It was found that there were “usual” responses for normal
subjects and individual or unusual responses for subjects with some
emotional maladjustment. Using this procedure Maller selected
responses which (1) were usual, and (2) were “uncommon, personal,
emotional, or involving superstitious ideas.” He tried out these 200
items on (1) adult insane groups, (2) adult normal groups, (3) probationary
children, and (4) normal children and found that 50 items
distinguished clearly between the normal and the probationary children.
Here are two sample items:

The subject underlines one of the two words on the right which is
connected with the word on the left or he may write in a word.
The adhu - Е | t jg made ир of items selected e pes s 2o 
а 281 IF 9 t Е: о 
Character! stment tes jection of items was based on а "thoroug 
: sketches, The se s of well adjusted children and 


i 1 ü onse Я д 
analysis comparing the Pun cases, delinquents, and psychiatric 
= xtreme introversion, lack 


Ults with ^ 

; those of serious p” : € 

a The undesirable responses а inferiority , and symptoms of 

~co ing of inadequacy ask he feels the 

© кк pun ы » The subject 18 asked whether 
am = 

€ (S) or different (Р). 


4. Sometimes has a feeling that things are not real.
14. Hates people who tell him frankly what they think about him.

¹ Teachers College, Columbia University. Items by permission.


"uorssiuuiod Aq surat] `55914 uonepossy HIOA MIN " 
odun urea JO Ч0159504 sy 


b 39231 L 
оз әреш 512 шәз1® ЧЕ С ISIL, up 


P экш ы səl KpoqÁ19^d 100925 ur pi 1e[ndod jsour oq а 
coy gop oh ium cs ae РТ Чар, чорна aq 
«10025 Ш ouoKu jo soon 359210 мрн e j^ ux uns ae ® 
aq} jo yed 15} ot 10} Sunninsqns q P q no 


gaoy әз ISP 9d OF YS T og 


soK 
9Uj pt É 219 әз ASME T шү 
SU, | p л == jooups ut 118 159131914 oq sr Азр 
-uonensnirt Ч® Aq pourejdxo 3599 SEP «ISIL, 

[ puvist 3195Әр v uo әл 0} Келе Surog 
әләм noA у nox чум 9x93 OF asooqp proa. nox enm Jo Soury 
aq umop Sunua Jo uo[qo1d plo 9m uo qmq S! € илә» 3Uoso1d jy 

E ey o} 199819 29 01 “лә8полу$ 


aq 03 10 ‘ouou әлош 9A к 
стоме e 29835 € Uewod 


uq 2q (ir 'puooos aq, 
оз qsqa pro^ пок uos1ad [0 3105 9[] чут 
за op о} 9A? yor Jo puooos рит Isay 
т 591005 ,onsouSeIp,, 110] AI, 
"suro[qo1d sty 
qs ‘sə109S 99113 1oqjo OY} YHA uev? ‘pur әд 
soje»Iput 21025 Surureo1pÁep әчү, ^p 
'sjuo1ed цз0 10 9uo uo aouopuodop (nu 00} 51 919] I9g]9uA 
10 ‘рәзпемип Suq jo Sur] ® st 21e DPA ‘yuasaid 51 5801415 ло 
syuored jo Asnoyeal io]joqA soyeorpul 91028 juounsn[pt-A[rurej эчт, ‘є 
‘STES Teros 
1ood ‘s}28}009 dno13 sty ut Addeyun si 
os quounsn(ípe[eur[9D)0$ L "c : 
‘Zuryoo] poo sso[ *19xv9A Imp 


uey} 3ur[oor 19319q 
aq oj qst jqgpur 240 
? aq оз JUVA 30810 IUO “soyst 
‘gyesinod adueyo no& pmo? aq 
op оз seq 3s1g oq ‘599814 Ч 
ay} 491993, , XIS ШОЛ} poAuop әт 


BULATOS st pry? oY} MOY SAO 
Asequey s,prqo oq) jo 3u91x 9m 


ay} ur 1004 'spuon 2ur[eur e 
aq yorga оў зшәзхә oq] $ә105%9Ш 910 


'suorueduroo sry uey} *o[qudo sso] 
“g4—oyenbopeut Á[[ejuour 10 Kqpeois&qd oq 03 jpsumq syu} pryo 


ay} yora оў ooigop əy} SƏEPUL 2109S f3uouojur[euoszod oq] ^ 
"suorvropisuoo [eorvad 03 Зитрлооов рәртА1р 9109S INSOUBLIP pares 
-os INO} uo paseq әле posn sUOIsIAIP ӘЧ], *osvo үепрілтрщ oy} JO Apnjs 
рәрезәр ay} 10] Kqrunj1oddo ue se *K1eurums [eonoumu v Se Á[o1our 
uodn рәҷоо[ 21e sa100S I} *axe[d puooes ay} UJ PPVU uooq peu Apnys 
әлцѕпецхә ue urogA uo ‘UPI uro[qo1d 043-43] JO Ápn3s ? uodn 
sumou oy} Sutseq Aq poinss? SEA Kyrpi[e^. 51] /$1unoo 043 UO anbrun s 
“ст 03 6 Sade 10} [njosn ;aueunsnfpy Áj[euosioq Jo 3591, s1330% eur ` 


б/? SIIVNL ALIIVNOSH4d JO INGNAXASVAN 


"£61 ‘SUOG souquo 
иптә ji PUD 5189.1, "Korg `0 "у pu 
иди uo siseydury tra шоолѕ 
ut S әп BUR 
o 22 


сушы» ї{лод MƏN 752202128 101005 IYI 
Бе LL AY ut ү/ф—,єў "dd "риоя 
[9 әй ur Bureau), “ур "y еро; 
«ngos Aros 9q UVO 31 910Joq AjtprivA jo 
опор 9d 93 5РӘӘЧ Yonu ‘әѕгшол4 s 


adi -gor101U9A UT тоо ошон UL Su: pays I 
aD and ou,L "9100295 I} JO qvo 10] o[qersa 


"o[q*IS9p 350ш st VEL 
пе opu ® se K1ojuoAur oq; 10] үә 
jou P yendod g UO paseq o1? suriou 9AT 
jo 99" ата 20у PUB souo ewou (ууу S 
30 шоо оз posn uooq sey Алоўцәлщ ә х н меан 
91e K0100AUt 911 10} #67 01 £6* шолу pu? ѕцотѕтдт 
sv - uro1] SutKivA *&1ojovjsryes $913 
=. ‹ y е 
96 ото m pur Siea] SIYSIM YPM Suieap 52931 JO sous e pu? ‘sataour 
р (qom e01991 S3[00q ШІ 515219jUI 'sn3vjs 9IUIOUO220120s $ qoatqns ə 
(gm Suumbut puo 91 те ѕиоцѕәпр jo 591195 ? seq ose KxogusAur oy, E 
m | 
зт 3noq? Зшцќив Avs з,чрїр рит p10221 oq; pry dyryg 
‘pauoddey рец зеца Jonjoui sy роз апда 


121047 A[peq pəz pmoa ләцўош sry ye 
aem *9j]0A?] 5.190103 SIY SUA уе p10201 v[01j91A v 9x01q pue impo ашчы 
4 du 


IsI 
dures әчО *sdno13 [euriou pue “әдә 'sxouoryeqoad 'syuonburpp uoo 


по1уеШІШ1Э81р TO} JO 51509 oy} uo po3»opos әләм SWI VAL ISIL, 


^d аар" f essa SAMH шолу PIAS arom SWJ әчү AT Weg uy 
1 1`*чзлол\увплуип әд 03 sjmsoz эчу Busnes Адәләцъ 
ш uon2es JO asodind əy} 0} uo qoe» prom чәлрүщә ләрүо ojmise 
аш yey} чч se»urAuoo jroop Sut&pnis ш onbruqoo? Supso1-o[qnop 
vx әоиәшәйхә 5,лоцупт dy} “әлошләцзлп g "[e19uo8 jou ‘oytoads әле 
em om Joq ШІ „(591005 359) чәў}їїл\-и-ш-Аўзәчоц,, SE Mq (891005 
bs ouod, SE Pat? 1diojut әд Jou p[nous Аузәчоц uo so105s ay} Á[qeqoiq 
«€ ut 
oq seat рлооәл uo qojeur Sutxoq 3s93uo[ әці,,, *o[durexo 10,7 oim 
Б jo рәзоәйзпв Аүәуетрәшш oq Хеш oSpopwoUy ipu) оў se шер 
Ku 3993 IMP OS әле Sisti SIE Jo SUIS “Sad dont pue 531005 jo зә, 
g Torey 4101} Рә32ә]$ SWH Jo рәѕоїшоо st “ysa} Aysauoy aq “с yed 
[1 


SHDHOINSANI ALIIVNOSUdd 8/7 




How many friends would you like to have?
a. none
b. one or two
c. a few good friends
d. many friends
e. hundreds of friends

Do people treat your brother better than they treat you?
a. never
b. sometimes
c. often
d. almost always
e. I haven't any brother or sister.


š pject 
of this personality inventory the stg 
siblings, his best boy friend and be 


1907 
then “put a ‘1? in front of the рез, 


tailed 


e 
are given which difficult to follow 
Norms exe based Hus i are not too ашп а, 


h 
Я average,” and “high” for each of t jn* 
divis Бе, and “high” for e Jas 
lons. Finally the manual gives four case histories and €* 
y aids in their interpretation. table 
nventory is around .70, a score unsere B 
ndividual case but perhaps high enou£ t yeast 
ее "Леге the whole case history had been made. ^, үре 
ays, "We have used this test in our clinics, апа WY, phe 


1 
this €ncomium, the + ument of personality measurement In 5р he" 
more, the very valid est lacks independence in its divisions вй 
on low correlatio ‘cuties on which its claims to excellence lie roe 5 "i 
clinicians’ ratings, T (38 to 4g) between the scores of the "en ай 
personal ШЕН, an the divisions of family maladjustm ; T» 
inventory is a Clinicia; € correlations with ratings fall below а e» 
to the support of omis attempt to bring objectivity and meas b 


a "natural" division into f 
ing procedure. 
! C. M. Loutit, Nineteen Fo, 


‚В 
à al M earbook (OSC9T "3. 1 
ed.), Item 1258. Highland Park, N.J.: The TE enoni Fass N ok 


TABLE 17. LIST OF PERSONALITY INVENTORIES NOT INCLUDED IN TEXT

Name | Grade | Validity and contents | Reliability | Publisher


Bell School High school 


Link Inventory | 7-13 


Brown Person- |4-9 


Inven tory 


of Activities 
апа Interests 


ality Inven- 
ty for 
Children 


Т 
he Detroit Junior and 
ae senior hig 
огу school 
(Н. J. Baker) | cnp, 


Mi 
Rnesota. College and 
Seo last 2 years 

à n (John G.| of high 
rley and school 
alter J. 

IcNamara) 


L 
ofburrow- 7-9 
ys Personal 
ex 


Items selected to differenti- 
ate between upper and 


lower 15 per cent o 


f 450 


high school students. Meas- 
ures adjustments to (1) 


fellow students, (2 


) school 


plant, (3) school organiza- 


tion and offerings, 


school administration, (5) 


teachers 


Part 1 checks interests in 


games and studies. 


Part 2 is 


150 items from which scores 


can be obtained: (1) 


sonality, (2) social 
tive, 


per- 
initia- 


(3) self-determination, 


(4) economic self-determin- 
ation, (5) adjustment to 


opposite sex. M 
Р.О. (personality q 
80 items. 


Iay derive 


uotient) 


Total scores ana- 


lyzed into (1) home, (2 
school, (3) physical symp-. 


toms, (4) insecurity 


‚ (5) irri- 


tability. Questions are open 
and evident. Validity base 


on literature about 


rotic chil 
120 items assembled 


24 topics including 
hysical status, WO! 

fears, anger, Pity, 1 

sion, home sta 


the neu- 


around 
health, 
rries, 
ntrover- 


tus, reactions 


e sportsmanship, 


Js. Depen 


against 
maladjustments- 


tween par ane 
be d which distin- | 


te 


ds 
clinical evidence 


ale scores 
]ly diagnose! 
ow T 


94 


For parts, 
.18-.88 


.90 


.84-.97 for 
parts 


.84-.92 for 
the divi- 
sions; .95 
for the 
whole in- 
ventory 


Stanford 
University 
Press 


Psychological 
Corporation 


Psychological 
Corporation 


Public School 
Publishing 
Company 


Psychological 
Corporation 


Educational 
Test Bureau 


In Table 17 there are described personality inventories not included
in the body of the text.

THE VALIDITY OF PERSONALITY INVENTORIES

Indications of the validity of personality inventories have been
introduced throughout this chapter. In the present section evidence will
be brought forward to clinch the idea that such inventories must be used
with the greatest care. This evidence has been collected and evaluated by
Ellis.¹ He quoted directly from the investigators who used these
inventories, and then summarized and quantified the results. His
treatment covers for the most part those studies which utilized inventories
claiming to test neuroticism or introversion and
which were derived from the actual application of inventories to real
life situations. In behavior-problem diagnosis, inventories were
administered to groups of behavior-problem children and their results
compared with those secured from normal groups. In diagnosing
delinquency, test results of delinquents are compared with those of
normal groups. In psychiatric and psychological diagnosis, results from
case studies by psychiatrists or psychologists are compared with results
from the inventories. In rating diagnosis, teachers, friends, or associates
rate individuals and these ratings are compared with scores on the
inventory.

TABLE 18. VALIDATION OF PERSONALITY INVENTORIES: NEUROTICISM
OR INTROVERSION*
(STUDIES OF DIFFERING TYPES, ELLIS, 1946)

                                                       Questionably
                                    Number  Positive   or mainly     Negative
                                                       positive
1. By type of study
   1. By behavior-problem diagnosis.
      Subjects mainly children          9       2          1             6
   2. By diagnosis of delinquency      34      15          6            13
   3. By psychiatric and
      psychological diagnosis          75      36          9            30
   4. By rating diagnosis. Ratings
      by teachers, friends, or
      associates                       44      12         10            22
      Total                           162      65         26            71
2. By inventories
   Bell Adjustment Inventory           12       1          0            11
   Bernreuter Personality Inventory    29       9          6            14
   Thurstone Personality Schedules     10       4          1             5
   Woodworth Personal Data Sheet       29      11          4            14
   Other personality tests             82      40         15            27
      Total                           162      65         26            71

* By permission of Psychological Bulletin.

By inspecting the totals in the table it is evident that out of
162 studies only in 65 cases did the inventories clearly differentiate
between the groups studied. The author concludes (page 426),

It is concluded that group-administered paper and pencil personality
questionnaires are of dubious value in distinguishing between
groups of adjusted and maladjusted individuals, and that they are
of much less value in the diagnosis of individual adjustment or
personality traits.

While the author of this text does not subscribe entirely to such
devastating criticisms of personality inventories, he realizes the
importance of extreme care in the inferences drawn from the results of the
administration of personality inventories to children and high school
students.

¹ Ellis, Albert, “The Validity of Personality Questionnaires,” Psychological
Bulletin (1946) 43:385-440.

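
The proportions behind the statement that only 65 of 162 studies gave clearly positive results can be checked with a short sketch. The counts below are as read from the four validation-method rows of Table 18; the function and variable names are illustrative assumptions, not part of Ellis's report.

```python
# Counts as read from Table 18 (Ellis, 1946), one entry per validation method:
# (positive, questionably or mainly positive, negative).
STUDIES = {
    "behavior-problem diagnosis": (2, 1, 6),
    "diagnosis of delinquency": (15, 6, 13),
    "psychiatric and psychological diagnosis": (36, 9, 30),
    "rating diagnosis": (12, 10, 22),
}


def column_totals(studies):
    """Sum each outcome column over all validation methods."""
    return tuple(sum(column) for column in zip(*studies.values()))


def percent_positive(positive, questionable, negative):
    """Percentage of studies in a row that were clearly positive."""
    total = positive + questionable + negative
    return round(100 * positive / total, 1)


if __name__ == "__main__":
    for method, counts in STUDIES.items():
        print(f"{method}: {percent_positive(*counts)} per cent positive")
    print("all studies:", percent_positive(*column_totals(STUDIES)))
```

On these figures about two studies in five were clearly positive, which is the basis for the caution urged in the text.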
RATING SCALES 


Another procedure used for securing quantitative expressions of 
Personality Кав is that of rating. Aspects of rating Série. bem 
apparent in the neurotic inventories, 1n the cr meme. Sem 
and in the expressions of attitudes. But in eac Peers me no 
cedures the rating was largely e m d Ше the time?” by mark- 
oe ав“ о you feel Qa FU elf upon the possession of 
es," "No, or ng his attitude by indi- 


this trait. Again, when 4 t he is rà 
“ating his degree of belief in а e statement, 

е, when he indicates his hotels, and s 
x egroes in trains, restaurants, 1, or —2, in which 2 indicates 
*quired by law” (voting ^ >? ent is true, and —2 an equally strong 


E . . Ж, 

ud conviction that the Mi rith the other numbers indicating 
s false, W = itud 

himself on an attitude 


Conyicti n " 

: ion that е statemer `. is rating 

E zi ef), he 15 | is liki 

ae x positions pr n individual expresses his liking for, 
- In like manner; 


1 
Hunter, op. cit, 



i islike of 
i rence of, or dis с s 
d set! he is rating himself їп interest 
self-rating is worth while ha 


ever, 


i 
n which a self-rater mes e 
placing himself in an My 
have none of this at a " sonality 
or independence of pe оће! 
trait has been rated, can d by 


ve 
ly ha 

Y are really related or mds 

Se of the influence or the rater's atti 


TYPES OF RATING SCALES

The forms which these scales take are intended to improve the
accuracy and ease of rating and to help the rater put his judgments in
quantitative form.

In the first form a line, usually about five inches long, is
divided into five or more equal divisions and underneath each
division is placed a verbal description of that amount of the trait
possessed. A good illustration of this type is Schedule B of the
Haggerty-Olson-Wickman Behavior Rating Schedules.²

¹ Strong, E. K., Vocational Interest Blank for Men. New York: Psychological
Corporation, or Stanford University, Calif.: Stanford University Press.

² By permission of World Book Company, Yonkers, N.Y.



6. Is he mentally lazy or active? 


| | | | | 


Interests Lethargic Is ordinar- Ea 
А ger Shows h - 
lazy and inert idles along ily active a d 
y 
(5) G) Q) m (4) 
11. What is his physical output of energy?

Extremely    Slow in      Moves with       Energetic    Overactive
sluggish     action       required speed   Vivacious    Hyperkinetic
                                                        Meddling
  (5)          (3)          (2)              (1)          (4)

24. What tendency has he to criticise others?

Never        Rarely       Comments on      Has a        Extremely
criticises   criticises   outstanding      critical     critical,
                          weaknesses or    attitude     rarely
                          faults                        approves
  (3)          (1)          (2)              (4)          (5)

One’s rating is indicated “by placing a cross (X) immediately above
the most appropriate descriptive phrase.” The numbers at the bottom
are used for scoring and are derived from the behavior scores which the
individuals so rated received. For example, in No. 11 the children rated
as extremely sluggish “had an average behavior score of 44.9 on
Schedule A” while the overactive ones averaged 27.1. For this reason
the first descriptive phrase in No. 11 is rated as 5 and the last one as 4.
To secure a behavior score for an individual, simply add up the scores
of the scales on which he is rated. The larger the score, the greater the
number of personality difficulties.
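
The scoring rule just described (look up the weight under the phrase that was crossed, then add the weights) can be sketched as follows. This is only an illustration: the weights are those printed under sample items 11 and 24 as read from the page, while the function name and the way marks are recorded are assumptions introduced here.

```python
# Weights printed beneath the five descriptive phrases of two Schedule B
# sample items, in left-to-right order along the printed line.
ITEM_WEIGHTS = {
    11: (5, 3, 2, 1, 4),  # physical output of energy
    24: (3, 1, 2, 4, 5),  # tendency to criticise others (order as read from the sample)
}


def behavior_score(marks):
    """Add the weights of the phrases a rater crossed.

    `marks` maps an item number to the position (0 = leftmost phrase,
    4 = rightmost) of the cross the rater placed on that item.
    """
    total = 0
    for item, position in marks.items():
        weights = ITEM_WEIGHTS[item]
        if not 0 <= position < len(weights):
            raise ValueError(f"item {item}: position {position} is off the line")
        total += weights[position]
    return total


if __name__ == "__main__":
    # A child marked "extremely sluggish" on No. 11 and
    # "extremely critical, rarely approves" on No. 24.
    print(behavior_score({11: 0, 24: 4}))
```

Because the weights do not simply rise from left to right, the lookup table matters: the overactive end of item 11 carries a weight of 4, not 1.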

There are many variations of this type (of rating scale). In some
scales the line is continuous; the points are defined, but one is permitted
to make intermediate judgments by checking at any point along the
line. Another variation is simply to have the line and numbers at the
division lines instead of descriptive phrases or else the same phrases at
each division in every trait. For example:


| 


"ROMA uw Somewhat Hardly Not at all 
Still another attempt at more exact definition of the trait is as
follows:¹

¹ Filer, H. A., and L. J. O'Rourke, “Progress in Civil Service Tests,” Journal of
Personnel Research (1923) 1:484-. Items by permission from Personnel Journal.



Attitude toward work:

Consider vol-  Uncon-      Interest     Average      Interest     Shows
untary         cerned      and effort   interest     and effort   interest
interest and   and no      below        and effort   above        and whole-
effort in      voluntary   average                   average      hearted
work           effort                                             effort

Neatness:

Consider       Disorderly  Somewhat     Average      Somewhat     Excep-
orderliness                below        orderliness  above        tionally
in work                    average in                average      orderly
                           orderliness               orderliness

t no 
ked bu 

In another type, there аге lists of statements to be chec 

line with its five or more divisions:! 


to 
t part 
The scale describes a set of situations related for the most р 

the social adjust 


ituation are 25 
ment of the pupil. Samples of the situation 
follows: 


А iscussion- 
I. Involves taking turns on apparatus or in group disc 

IV. Child has à Social task to be completed. 
ҮП. Child faced with failure. 


ХШ. When things must be organized for work. more 
ive à 

this last division with weights attached give 

w the checking is done. 

must be organized for work: 


hat wor 
Gets things he needs together ahead of time so t 
Боез smoothly, 


These samples 
istics of these instruments, All 
1 Van Alstyne, Dorothy, “A NewS 


р 
, racte a 
S exemplify the leading ch pre 


I 
three rating scales are carefully F ез 


-— АМ. 
1 i ol Behavior an5 cr 
ín the Elementary School," Journal of пааша оза (1936) 27 He 
Quoted in Jordan, A, M., Educational Psychology, 3d ed., p. 565. New pany: We 
Holt and Company, Inc., 1945. By permission of Henry Holt and COP... He 
? Jordan, A. M., Educational h 


ork: ne 
Psychology, 34 ed., pp. 562-563. New VS ny» s 
Holt and Company, Inc., 1942 By Permission of Henry Holt and 




pared. The traits to be rated are accur - meti 

defined. The division points are made өле. eee Sine ра 
means of words signifying different amounts of the ee ae 
rated. In the best of these scales demarcation points are € Ы А 
by means of some ubiquitous selection of words such as Wr 


average, as applied to the traits being rated but rather are made to
stand out by words indicating a certain nicety of distinction such as
defiant, critical of authority, ordinarily obedient, etc. These exactly
descriptive expressions aid the rater in recognizing the differences
otherwise impossible to distinguish.

There are other characteristics of good rating scales apparent in
these samples. Scores placed along these lines may be given quantitative
aspects simply by designating the first division as “1”; the
second as “2”; etc. These numerical records give opportunity for
combining one individual’s scores on several traits. In the third
place, there is a tendency under the influence of careful selection
of traits and their accurate description to make the judgments
themselves more analytical so that gross total characteristics are
broken up into much smaller traits. Finally, you will notice that the
material of the scales is given a permanent form in printing, thus
again emphasizing the care utilized in their construction.

These excerpts from rating scales illustrate the variation upon a
central theme which can be made. In general, they follow the rules of
good scale construction pretty closely:

1. Not more than seven divisions of the line
2. Divisions reinforced by careful verbal descriptions
3. A continuous line
4. Simplicity of administration
5. Extremes not so far distant from the mean that nobody will use
   them
6. Descriptive terms easily understood by the rater

They fail in one recommendation which is worth considering. There
is a tendency for ratings to be made near the average when there is
doubt and uncertainty; consequently the division “about average” gets
such a large number of ratings that they are unwieldy for statistical as
well as for practical purposes. For this reason it has been recommended
that the two divisions between the median and the extremes in a
five-point scale be placed nearer to the median than to the extremes. This
would make the line look something like this:

|            |     |     |            |

l 


488 PERSONALITY INVENTORIES um 
e 
$ amounts represented on the sca Jeast 
Ataer estat de os 
A "te wide amount; and 5, the greatest m nash the 
timen) cooperation, honesty, emotional cisci dn dime Е - 
rater usually translates those numbers into cp n puts down А 
own which characterize the trait being rated an ране lost i n this 
proper number. While something of accuracy is E Ar each B а 
procedure, it offers a practical way to get many r s using ihe gn 
vidual. After an individual had made many rating on a S-inch line; 
complete scales with their verbally described divisions 


he 
beri ыы small. 
the amount of error made on a method of this kind is Feigao etc 
author has used this procedure in rating cooperation, in 


with satisfactory reliability. 


SAMPLES OF RATING SCALES

Thus far samples of techniques which are used in rating have been
presented. To complete this exhibit two or three rating scales in their
entirety will be presented.

The Haggerty-Olson-Wickman Rating Schedules are divided into
two parts, Schedule A and Schedule B.

Schedule A consi 


never 


jonà 
occurred,” “has 


Occurred once 
occurrence,” or ‹ 


à » «occas 
ог twice but no more, is weg 
‘frequent occurrence.” Each one of these i 
differently as follows: pu 
4 
ore 
Has occurred Occasional Frequent Sc 
never once or Occurrence occurrence 
occurred | twice but 
no more i 
Disinterest in school Work... 0 4 6 1 
ТУШЕ... аа 0 4 б 4 
Temper outbursts... 0 8 12 14 
Imaginative lying... | 0 12 18 4 
Obscene notes, talk, or pic. 1 
ҮШ... rene a 0 12 18 2 " 
1 
'These items had been sele 


1 Items by permission of World Book Company, Yonkers, N.Y. 




and some 30 or 40 mental hygienists about the seriousness of traits
when they occur in children at certain ages. The instructions are “Put
a cross (X) in the appropriate column after each item to designate how
frequently such behavior has occurred in your experience with this
child. . . . The numbers are to be disregarded in making your record.”

The nature of Schedule B has already been indicated in the three
samples taken from it on page 485.

The reliability coefficients are reported only for the 35 rated items of
Schedule B. The reliability by the split-halves procedure is .92. When a
correlation is made between the ratings of different teachers it turns out
to be .60, and between one teacher’s rating and the average of the ratings
of three or four teachers the coefficient is .70. If reliability were
computed as with other measures the same rater would rate a group of
subjects the second time. This procedure undoubtedly makes the reliability
too high on account of the memory factor. When a rater has once rated
James Sewell, for example, he will on the second rating give him nearly
the same position as he did at first. On the other hand, the correlations
between ratings by two different raters are probably too low. The same
conditions are not repeated because of (1) the different experiences the
two raters have had with the subject, and (2) the differences in the set
or attitude of the two raters. It would seem therefore that rerating
gives too high a reliability coefficient and ratings by different raters,
too low a one. A coefficient somewhere between the two more nearly
approximates the truth. The true coefficient in this case falls perhaps
between .60 and .92 or in the neighborhood of .75 or .80.
This same difficulty appears in the reliability of all rating scales.
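
A short sketch may make the bracketing argument concrete. The first function simply takes the midpoint of the two reported coefficients. The second applies the Spearman-Brown formula, a standard psychometric result that the passage above does not itself invoke, to estimate how the reliability of an averaged rating grows with the number of raters. The function names are assumptions made for this illustration.

```python
def midpoint(low, high):
    """Midpoint between a too-low and a too-high reliability estimate."""
    return (low + high) / 2


def spearman_brown(r_single, k):
    """Estimated reliability of the average of k raters.

    r_single is the correlation between two single raters. This is the
    standard Spearman-Brown formula, assumed here for illustration only.
    """
    return k * r_single / (1 + (k - 1) * r_single)


if __name__ == "__main__":
    # Between the inter-rater figure (.60) and the split-half figure (.92).
    print(round(midpoint(0.60, 0.92), 2))
    # Pooling several raters whose single ratings correlate .60.
    for k in (1, 2, 3, 4):
        print(k, round(spearman_brown(0.60, k), 3))
```

The midpoint falls in the neighborhood the text suggests, and on the Spearman-Brown assumption the pooled rating of three raters would already be more dependable than any single rating.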
The validity may be measured by correlating the ratings of one rater 
With a group of four or five raters. “The validity of the Behavior Rating 
nical cases, and the sub- 


Scal died by means of ratings, clir 
ne ed ane « А composite score on Schedules A and 


ith whi f children were 
B co e ith the frequency with which a group o 
i ae a and monitors to the office of an elementary school 
Principal Ў bs was also demonstrated that half the cases referred to child 
guidance clinics fell into the highest 10 per ee bo нак 

P 2 :ng to the ratings of te ic A 
Ec eg e a -Wickman Behavior Rating 
e manual O 


Haggerty-Olson 
Schedules warns against ratings whic lowed by an attempt 


h are not fol б 

Hate or correct the соп itions 
to st further and to allevia отте 
Mrd Ned best feature is the careful description of the 15 prob- 
lems which орке Schedule A. For example, Speech. difficulties. 
Under this heading inc ng or stammering, 


the substitution 
of one sound for another, 


s indicated by pro- 
А letters or sounds." 
Douncing letters or sounds incorre 


ctly or by slurring 


490 PERSONALITY INVENTORIES | E. 
. LH 1, О: 
Norms are provided in the nature of tables d is рне: 1,065 
scores for Schedules A and B, and percentile ranks «A wer nishe 
to 2,867 cases. Furthermore, similar tables and percen ye emotional 
foreach of the four divisions: intellectual, physical, socia e Attitudes! і 
The Winnetka Scale for Rating School Behavior т arable degrees 
made up of 13 school situations with seven more or less de 
of participation. Here are two illustrations: 


IV. When a child has a social task to be completed.

                                                1st | 2nd | 3rd | 4th | 5th

Carries task to completion even by sacrifice of other
interests. (10)
Carries task through by steady effort even though it does
not harmonize with special interests. (9)
Carries task through only when it does harmonize with
special interests. (6)
Carries task through although application is unsteady. (3)
Drops task; loses interest quickly. (1)
Tries to escape task by contrary behavior or by shifting
jobs. (0)

VII. When faced with failure.

                                                1st | 2nd | 3rd | 4th | 5th





"eec qi ue of qup PEE consciousness, emo- 
| | Я р, ап responsibility. Three situations ar 
combined into one of these dimensions. For example, under “cooperation”
come ratings on: (1) taking turns with apparatus or
in a group discussion, (2) carrying out a group project, and (3)
facing a social situation involving sacrifice of own interests or needs to
those of group. By averaging the three deciles received on each situation
composing the dimension a percentile score may be obtained for it. In
this manner a profile may be secured for each year of rating with
percentile positions on each of the five dimensions.
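
The averaging step described above can be sketched briefly. The three situations listed under “cooperation” come from the text; the decile values and the function name are hypothetical, and the sketch stops at the averaged decile because the manual's conversion to a percentile is not given here.

```python
from statistics import mean


def dimension_decile(situation_deciles):
    """Average the deciles of the three situations that make up one dimension."""
    if len(situation_deciles) != 3:
        raise ValueError("each dimension combines exactly three situations")
    return mean(situation_deciles)


if __name__ == "__main__":
    # Hypothetical deciles for one child on the three "cooperation" situations:
    # taking turns, carrying out a group project, sacrificing own interests.
    cooperation = {"taking turns": 4, "group project": 6, "sacrifice": 8}
    print(dimension_decile(list(cooperation.values())))
```

Repeating the calculation for each of the five dimensions gives the profile for one year of rating.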

The rating scale was carefully constructed with preliminary observations
and ratings by means of which corrections in language and scoring
were made. The final norms were secured from the ratings of some 1,200
children. The reliability based on the rerating of the same children after
2 to 8 weeks was .87. The categories also were fairly reliable, varying in
their coefficients from .12 to .82. The correlation between these ratings
and ratings secured through the Haggerty-Olson-Wickman Behavior
Rating Schedules was .71.

The Personality Rating Scale for Preschool Children¹ of the Merrill-
Palmer School consists of items to be checked in nine different dimensions:
ascendance-submission, attractiveness of personality, compliance
with routine, independence of adult affection and attention, physical
attractiveness, respect for property rights, response to authority,
sociability with other children, and tendency to face reality. It was
developed especially for the nursery school and has its reliability
computed only for this age although it has been used somewhat with children
of school age. Each of these divisions has a list of items which the rater
simply checks. These items are descriptive of simple habits. For example,
under “Compliance with Routine” appear such items as “acts silly
at lunch table,” “refuses many foods,” “dawdles over routine activity.”
Percentile scores are available for different age groups. In the
ascendance-submission category percentiles are available for (1) months 24 to
47, (2) months 48 to 143 and (3) months 144 to 203.

In Table 19 there is a list of other rating scales.
SUMMARY 
personality inventories has been kae 
i ibili *tv of the traits have 
eats intangibility and complexity 0 | 
| hans satisfactory analysis. When the total personality 
into measurable traits there was no assur- 
udy of Personality 


tsman Ball, “A St 3 
f Rating Scales," Journal of Genelic 


Our quest for satisfactory 


Partially realize 

In part prevente 

complex has been broken UP 

. 2 Roberts, Catherine Ellis, and Rachel Stu 

А. Young Children by Means of а Series О 
Sychology (1938) 52:79-14% 


TABLE 19. LIST OF RATING SCALES NOT INCLUDED IN TEXT

Name: Rating Scale for School Habits (E. L. Cornell, W. W. Coxe,
J. S. Orleans)
Grade: Upper grades and high school
Contents and validity: Attention, neatness, honesty, interest, initiative,
ambition, persistence, reliability, and stability. All nine scales
contained on one page. r = .55 to .75 with school marks
Reliability: None given

Name: American Council on Education Personality Rating Scale
Grade: 9-13
Contents and validity: Before rating make observations of subjects.
Report instances that support rater's judgment. The descriptive scale (B)
includes five traits: industry, ability to control others, appearance and
manner, emotional control, distribution of time and energy. A is a
graphic scale
Publisher: American Council on Education

Name: BEC Personality Rating Scale (Business Education Council)
Grade: 7-16
Contents and validity: Rates eight areas of personality: (1) mental
alertness, (2) initiative, (3) dependability, (4) cooperativeness,
(5) judgment, (6) personal impression, (7) courtesy, and (8) health.
Each one broken down. For example, under dependability: (1)
trustworthiness, (2) persistence, (3) punctuality, (4) obedience to rules
Publisher: Harvard University Press

Name: Vineland Social Maturity Scale
Grade: Infancy through adulthood
Contents and validity: The 117 items of the scale are arranged in order
of average age norms and are numbered in arithmetic succession from
1 to 117. The groupings of items at age follow pretty closely the pattern
of the Binet tests. User of scale needs training in its use. May compute
social ages. Author claims it is not a rating scale
Publisher: Training School, Vineland, N.J.

3 fact Partially measuring another- 
progress seemed possible onl 


ce 
chant'q) 
Two major methods were discovered which offered some а( 
securing improved measurem, 


nts: (1 self- f-report a js 
hi :( ) lf. rating or self-report e pi 
ratings by others. In self-rati 


own reactions to situations carefull indi 
some personality traits. It was Cogs a ee o 
E ve: would indicate the presence of certain сее: boss 
ity such as dominance-submissiveness or neuroticism. A E 
quence, questionnaires or, more technically speaking hrs des wer 
аваат on which ап individual could register the im danh 
2 ү But even here difficulties arose such as those which had to 
RA x the willingness or ability of an individual to disclose his inner 
. Samples of these inventories were presented which avoided 
least ameliorated the effects of some of these errors. e 
The second method, that of rating by others, avoided at least the error
of self-favoritism but added some of its own: inadequate observation,
failure to define clearly the trait being rated, and errors due to
bias. So ubiquitous were these errors that at least three raters were
necessary for dependable results. Some improvement in rating was
achieved by using a continuous line below each of whose division points
the amount of a trait was verbally described.

In this area of the measurement of personality traits it is well to
emphasize the great necessity of interpreting the findings as tentative
and inconclusive. In no other area is the need so great for gathering all
the available data about a subject and then introducing the results of
inventory or rating scale into the total picture. Under such conditions
the results of the inventories and rating scales are invaluable. They
furnish, if properly interpreted, capital aids in the interpretation of the
total personality.


QUESTIONS AND EXERCISES


1. How are self-inventories constructed?

2. Explain the fundamental difficulties and sources of error in
self-inventories.

3. Describe the leading characteristics of the Bernreuter Personality
Inventory.

4. Why is the validity of inventories difficult to determine?

5. What characteristics of the Bell Adjustment Inventory recommend
themselves for practical use?

6. a. Secure a Bernreuter or Bell Inventory and take it. Answer the
questions as honestly as you can. Score the results. How do these
results agree with your understanding of the presence of these traits
in you?


b. Would such a procedure in 100 cases be one method of studying the
inventory's validity?

7. How does Bell's Inventory differ from that of Bernreuter?

8. How do you account for the wide use in schools of the California Test
of Personality? Do you think the name "test of personality" is a correct
description of this instrument? Why?

9. From the statistical point of view, why is it dangerous to depend too
much on scores on the various dimensions of personality obtained from this
test? What is meant by the overlapping of categories? How is this
overlapping measured?

10. List the inventories constructed for use with younger children. What
difficulties are present in evaluating personality traits with these
inventories?

11. Examine the Rogers Test of Personality Adjustment. How do the
questions differ from those already described? How was it validated?

12. Name three other instruments for measuring personality traits. What
traits does each propose to measure?

13. How does the rating scale differ from the self-inventory? What are the
leading characteristics of a good rating scale? Name and illustrate three
types of rating scales.

14. What are the leading sources of error inherent in the rating procedure?

15. Describe the main characteristics of the Haggerty-Olson-Wickman Rating
Schedules. Include a discussion of their reliability and validity.

16. To what uses could rating scales be put in a progressive school?

17. To what uses could the Winnetka scale be put? The rating scales of the
Merrill-Palmer School?


BIBLIOGRAPHY 


Books 


Bell, Hugh M.: The Theory and Practice of Personal Counseling. Stanford
University, Calif.: Stanford University Press, 1939.

Buros, Oscar K.: The Nineteen Forty Mental Measurements Yearbook, pp.
1198-1245. Highland Park, N.J.: The Mental Measurements Yearbook, 1941.

Buros, Oscar K.: The Third Mental Measurements Yearbook, pp. 23-114. New
Brunswick, N.J.: Rutgers University Press, 1949.

Cronbach, Lee J.: Essentials of Psychological Testing, Chap. 14,
"Self-Report Techniques; Personality," Chap. [...], "[...] Techniques." New
York: Harper & Brothers, 1949.

Flanagan, J. C.: Factor Analysis in the Study of Personality. Stanford
University, Calif.: Stanford University Press, 1935.

Greene, Edward B.: Measurements of Human Behavior, Chaps. 17, 18, 19;
"Modes of Adjustment." New York: The Odyssey Press, Inc., 1941.

Super, Donald E.: Appraising Vocational Fitness, Chap. XIX. New York:
Harper & Brothers, 1949.

Symonds, P. M.: Diagnosing Personality and Conduct, Chap. III, "Rating
Methods," Chap. IV, "The Questionnaire," Chap. V, "Adjustment
Questionnaires." New York: Appleton-Century-Crofts, Inc., 1931.

Articles 


sust- 
faladjus 
Dartey, J. G.: “Tested Ma позе 


5 

H agr ч 
ment Related to Clinically D'E pied 

Maladjustment,” Jum 
Psychology (1937) 21:632- МАТ 

Erus, ALBERT: 35 

Personality Questionnaires, б. 
cal Bulletin (1946) 43:385—44 " 
Filer, Н. A., and L. d te 

"Progress in Civil Service Ta 
nal of Personnel Rese " 
Qo Y ба «qechnic?,. of 


р » Jour 1^ 
pects of Multi-trait Tee iss) 6:64 

Educational Psychology ( кю: 
651. ILFÓ' 4 


GU 
Guirronp, J. P., and xm е 
“Personality Factors S. "nal of 
Sir Measurement," Jot 
chology (1936) 2 ^ P 
Hataway, S. Ra “T 
nventory as an Aid in th 
Psychopathic Inferiors, 
Consulting Psychology К. 
117. A. JO 
ТАвутЕ, L. L., and A> tor 
"Does the Bernreuter Tuy Educ 
tribute to Counseling? 9, їс 
Research Bulletin цо s чүш ust 
LANDIS, CARNEY, ult 00 
нА of Three Personality ^ p jy 
ment Inventories," opi 
tional Psychology (1935) 
Roberts, Catherine Ellis, and Rachel Stutsman Ball: "A Study of Personality
in Young Children by Means of a Series of Rating Scales," Journal of Genetic
Psychology (1938) 52:79-149.

Speer, G. S.: "The Use of the Bernreuter Personality Inventory as an Aid
in the Prediction of Behavior," Journal of Juvenile Research (1936)
20:65-69.

Stogdill, Emily, and Minnie E. Thomas: "The Bernreuter Personality
Inventory as a Measure of Student Adjustment," Journal of Social Psychology
(1938) 9:299-315.

Super, Donald E.: "The Bernreuter Personality Inventory: A Review of
Research," Psychological Bulletin (1942) 39:94-125.

Van Alstyne, Dorothy: "A New Scale for Rating School Behavior and
Attitudes in the Elementary School," Journal of Educational Psychology
(1936) 27:677-693.


PART FOUR 


Statistical Methods 


CHAPTER 19 


Statistical Methods 


Throughout this book there has been continuous reference to statistical
concepts and statistical procedures. For this reason the treatment here is
in the nature of a summary and elaboration of statistical concepts already
familiar. More concretely, statistics has been used (1) in the construction
of tests, and (2) in the interpretation of results. In constructing and
standardizing tests mention has been made of norms, percentile or standard
scores, reliability, and validity. In the interpretation of results, if
complete use is made of the data, mention must be made of tables of
distribution, the accuracy of the results, and the meaning of scores such
as percentile or standard scores. A few other miscellaneous concepts such
as the standard error of estimate and the formula for interpreting the
influence of range on correlation have also been used. For the student to
get the best results he must follow point by point the treatment in the text
and work out the problems introduced in the exercises at the end of the
chapter, as well as answer all the questions there proposed.
The following statistical concepts are developed:
1. Measures of central tendency
a. Median and other percentiles
b. The arithmetic average or mean
c. Mode
2. Measures of dispersion or scatter
a. Standard deviation, T-score, and standard scores
b. Probable error (P.E.)
c. Semi-interquartile range (Q)
d. Average or mean deviation (mentioned but not computed)
e. Advantages of standard scores
3. The coefficient of correlation
a. Pearson product-moment method
b. Spearman rank-difference correlation method
4. Interpretation of coefficients of correlation
5. Uses of correlation coefficients
a. Reliability
b. Validity
c. Prognosis
d. Test construction
6. Sampling: standard error of the mean and of the standard deviation


ASSEMBLING THE DATA 


92 88 97 95 (100) (58) 90
94 72 91 83 88 83 87
82 78 64 68 97 95 86
85 89 77 61 74 59 85
86 71 95 90 92 62 80
91 90 66 63 85 71 78

In order to construct a table of distribution the highest and lowest
numbers must be found. The highest number is 100 and the lowest, 58. The
difference between these scores, called the range, is 42.


or steps of 1 there would 


ew 
e 42 st i ome some" pe 
unmanageable and уо EPS, which would bec es of th 


in 
á em 1 
th 0. 


vals 


jon 
А ibutio 
16 construction of our table of distri s 
the mean; 


в. 

tional and Г? aSurement is апу desired closeness to ous ча 
г; ttional 0 ;, іти , dis- 
Educational and psycho cal measurements are usually contint dis 


the 
1 A good rule to follow is to use from 10 to 20 step intervals. In general,
the smaller the interval the more accurate the answer. The author of this
text prefers about twelve intervals for ordinary work. In that case, we
divide the range by 12, which will give us the size of interval. This will
result in 12 or 13 intervals.


TABLE 20. DISTRIBUTION OF VOCABULARY SCORES
(Computation of the median and percentiles)

Scores            | Tallies    | Frequency (f)
100 (99.5-104.4)  | |          | 1
95 (94.5-99.4)    | |||||      | 5
90 (89.5-94.4)    | ||||| |||  | 8
85 (84.5-89.4)    | ||||| |||| | 9
80 (79.5-84.4)    | ||||       | 4
75 (74.5-79.4)    | |||        | 3
70 (69.5-74.4)    | ||||       | 4
65 (64.5-69.4)    | ||         | 2
60 (59.5-64.4)    | ||||       | 4
55 (54.5-59.4)    | ||         | 2
                                 N = 42

a. Median = 50th percentile = 85.6.
50 per cent of N = 21.
Start at bottom, 2 + 4 + 2 + 4 + 3 + 4 = 19.
Median = 84.5 + [(21 - 19)/9]5 = 85.6. 19 is the number of cases below the
interval in which the median falls. 84.5 is the lowest point in the interval
in which the median falls. 9 is the number of cases in that interval.

b. Q = (Q3 - Q1)/2. Q3 is the 75th percentile; Q1, the 25th percentile.
Q3 (75th percentile) = 91.69.
75 per cent of N = 31.5.
Start at bottom, 2 + 4 + 2 + 4 + 3 + 4 + 9 = 28.
Q3 = 89.5 + [(31.50 - 28)/8]5. 89.5 is the lowest point in the interval in
which Q3 falls. 8 is the number of cases in the interval in which Q3 falls.
Q1 (25th percentile) = 72.62.
25 per cent of N = 10.5.
Start at bottom, 2 + 4 + 2 = 8.
Q1 = 69.5 + [(10.5 - 8)/4]5. 69.5 is the lowest point of the interval in
which Q1 falls. 4 is the number of cases in the interval in which Q1 falls.
Q = (91.69 - 72.62)/2 = 19.07/2 = 9.53.


might stand for 95 to 


t 

граны ow pm ‚же ; ood illustration of this X ovn 
In the d Hle insurance ompanies 1n sangue € ч is RE 
а person 17 о r age ÎS t of, as is ОГ ys af ia 
having dues at his ast pirthday and as reaching is 


5 to 95.49. It 




e to 
year is finished. Life-insurance companies compute age to the nearest
birthday. Age 17, then, extends from 16 years and 6 months to 17 years and
5 months. In brief, 17 is 16.5 to 17.49. This method more nearly
approximates the truth than does the computation from the last birthday.
Let us now proceed to construct our table. It must extend high enough
to include 100 and low enough to include 58. We now start at 100 and
drop by steps of 5 to 55, i.e., our table must include both 100 and 58.
By definition the 55 stands for 54.5 up to 59.5, 75 stands for 74.5 up to
79.5, 80 stands for 79.5 up to 84.5, etc. At 80 are included scores from
80 through 84, at 85 are


fer our 42 scores on pag 
tally entered in the pro 
table at 90, for score 9 


MEASURES OF CENTRAL TENDENCY
The measures of central tendency are median, mean, and mode. Of
these three the median and mean (arithmetic average) are used very
frequently while the use of the mode is rare.


MEDIAN AND OTHER PERCENTILES
The median is d 


t 
z : is also 
) It is evident that the mid-point is ds 


how far up the scale this number extends. By observing Table 20 we
find N is 42; N/2 is therefore 42/2 or 21 (the half sum). We now begin
at the bottom of the frequency


greatly affected by extreme Scores, 




Percentiles 

It was pointed out in computing the median that it was the 50th
percentile. In computing the median we simply compute 1/2 or 50 per
cent of the scores and discover where this number falls in the
distribution. Exactly the same procedure is used in computing any
percentile. We take the percentage of the cases we desire and find
by interpolation its exact location. Thus for the 10th percentile we take
10 per cent of the cases and interpolate, for the 20th percentile we take
20 per cent and interpolate, etc. In this manner it is possible to compute
any percentile from 1 to 100.

Computation of Percentiles
To compute the 15th percentile, take 15 per cent of N, here 42 
(Table 20). This is 6.3. Count up from the bottom of the frequency 
column until you come to the interval in which 6.3 ends. In Table 20 
this becomes 2 + 4 and .3 is left over. The 15th percentile is then 
64.5 + [(6.3 — 6)/2]5 = 65.25. The 6 in the numerator is the sum of 
the cases below interval 65. The number 64.5 is the lowest point in the 
interval 65, The number 2 indicates the number of cases at interval 65. 
To compute the 65th percentile, take 65 per cent of 42, which is 27.3.
Count up the frequencies in Table 20 until the next step contains the
last of 27.3 as follows: 2 + 4 + 2 + 4 + 3 + 4 = 19. Now there
are 8.3 cases left which are contained in the 9 at 85. Computing,
84.5 + [(27.3 - 19)/9]5 = 84.5 + 4.61 = 89.11. Thus we see 84.5 is
the lowest point in the interval in which the 65th percentile falls, 27.3
is 65 per cent of 42, 19 is the sum of the cases up to interval 85, and 9 are
the cases evenly distributed over 85. You may check your understanding
of the procedures by comparing your computations with the following
answers: 40th percentile = 81.75; 70th percentile = 90.37; 1st
percentile = 55.55; 25th percentile = 72.62; and 75th percentile = 91.69.
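
The interpolation just described can be sketched in a few lines of Python. This is only an illustrative sketch: the names TABLE_20, INTERVAL, and percentile are chosen here for convenience, and the frequencies are those of Table 20, entered from the lowest interval upward.

```python
# Table 20 as (lower limit of interval, frequency), lowest interval first.
TABLE_20 = [(54.5, 2), (59.5, 4), (64.5, 2), (69.5, 4), (74.5, 3),
            (79.5, 4), (84.5, 9), (89.5, 8), (94.5, 5), (99.5, 1)]
INTERVAL = 5  # size of each step interval


def percentile(p, table=TABLE_20, interval=INTERVAL):
    """Return the p-th percentile of a grouped distribution by interpolation."""
    n = sum(f for _, f in table)
    needed = p / 100 * n   # number of cases that must fall below the point
    below = 0              # cases counted so far, starting at the bottom
    for lower, f in table:
        if f > 0 and below + f >= needed:
            # Cases are assumed to be evenly spread over the interval.
            return lower + (needed - below) / f * interval
        below += f
    raise ValueError("p must lie between 0 and 100")


median = percentile(50)
q = (percentile(75) - percentile(25)) / 2   # semi-interquartile range
print(round(median, 2), round(percentile(65), 2), round(q, 2))
```

Run on the 42 vocabulary scores, the sketch agrees with the worked figures of the text to two decimal places: a median of about 85.6, a 65th percentile of 89.11, and a Q of 9.53.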
Percentiles furnish points of reference in the norms of a large number
of tests. When tests were first standardized the usual percentiles
computed were the 25th, 50th, and 75th. But, as experience increased, the
need was felt for further points of comparison, i.e., percentile points, all
up and down the line. In interpreting such percentile points one must
remember that the 25th percentile simply means that 25 per cent of the
cases are below that point while 75 per cent are above it, and that the
eighty-fifth percentile means that 85 per cent are below that point and
15 per cent above it.
THE ARITHMETIC AVERAGE OR MEAN

The most familiar measure of central tendency is the arithmetic
average or the mean. It is computed by adding up the quantities and
dividing the sum by their number. Here we could add up the numbers
and divide the sum by 42. When the data are grouped as in Table 21,
the mean may be computed by first assuming the mid-point of some
interval as the mean and then adding or subtracting the correction,
thus arriving at the mean. Table 21 indicates the process.


TABLE 21. COMPUTATION OF THE MEAN

Mid-points | Scores            | f | d  | fd
102        | 100 (99.5-104.4)  | 1 |  4 |   4
97         | 95 (94.5-99.4)    | 5 |  3 |  15
92         | 90 (89.5-94.4)    | 8 |  2 |  16
87         | 85 (84.5-89.4)    | 9 |  1 |   9   (sum of plus values, +44)
82         | 80 (79.5-84.4)    | 4 |  0 |   0
77         | 75 (74.5-79.4)    | 3 | -1 |  -3
72         | 70 (69.5-74.4)    | 4 | -2 |  -8
67         | 65 (64.5-69.4)    | 2 | -3 |  -6
62         | 60 (59.5-64.4)    | 4 | -4 | -16
57         | 55 (54.5-59.4)    | 2 | -5 | -10   (sum of minus values, -43)
                             N = 42

Mean = assumed mean + ci (correction × interval)
c (correction) = Σfd/N = (44 - 43)/42 = 1/42 = .024
i (interval) = 5
Mean = 82 + (.024)5 = 82.12


In the computation of the mean by (1) the long method, each mid-point is
multiplied by its frequency; the sum of these products is 3,449, which when
divided by 42 gives 82.12. In (2) the short method a mean is assumed, and
the deviation of each interval from the assumed mean, counted in steps, is
multiplied by the frequency in that interval. The algebraic sum of these
products is divided by the number of cases, then multiplied by the size of
the interval, and added algebraically to the assumed mean. Table 21 shows
that the assumed mean is 82. Column d contains simply the number of steps
above (+) or below (-) the assumed mean. Column fd is, of course, the
product of column f and column d. The computation and answer, 82.12, appear
in Table 21.

In this problem you will note that the median is 85.6 while the mean
is 82.1. The mean was pulled down by the six cases at 55 and 60. Extreme
scores are weighted according to their size in computing the mean. It is
well to remember that extreme cases affect the mean more than
they do the median.
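
The long and short methods for grouped data may be compared with a small Python sketch. The function names are hypothetical and chosen only for this illustration; the mid-points and frequencies are those of Table 21, and the assumed mean of 82 is the mid-point of interval 80.

```python
MIDPOINTS = [102, 97, 92, 87, 82, 77, 72, 67, 62, 57]
FREQS = [1, 5, 8, 9, 4, 3, 4, 2, 4, 2]
WIDTH = 5  # size of interval (i)


def long_method_mean(midpoints, freqs):
    """Multiply each mid-point by its frequency, add, and divide by N."""
    return sum(m * f for m, f in zip(midpoints, freqs)) / sum(freqs)


def short_method_mean(midpoints, freqs, assumed, width):
    """Assumed mean plus the correction c (in steps) times the interval."""
    n = sum(freqs)
    # d is the number of steps above (+) or below (-) the assumed mean.
    sum_fd = sum(f * round((m - assumed) / width)
                 for m, f in zip(midpoints, freqs))
    c = sum_fd / n
    return assumed + c * width


long_mean = long_method_mean(MIDPOINTS, FREQS)
short_mean = short_method_mean(MIDPOINTS, FREQS, assumed=82, width=WIDTH)
print(round(long_mean, 2), round(short_mean, 2))
```

Both methods give the same result because the short method is only a rearrangement of the long one; any mid-point could serve as the assumed mean.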
THE MODE

The mode may be thought of as the "value in a series at which the
greatest frequency lies." This value, as may be seen in Table 21, is 87.
The mode is also calculated from the formula:
mode = (3 × median) - (2 × mean).
In our problem this would be (3 × 85.6) - (2 × 82.1) or 92.6. This
does not seem to be a very representative number for our distribution.

MEASURES OF DISPERSION OR SCATTER

Two questions stand preeminent when a table of distribution is in
question: (1) What is its central tendency? (2) What is its dispersion or
scatter? The second question may be asked more informally: How
closely around the central tendency are the cases grouped? Are they
packed in close, or are they scattered out until there is no semblance of
unity in the group studied? There are four of these measures which
differ in quantity but not in quality, i.e., which differ as the meter differs
from the yard:
1. Standard deviation (S.D.)
2. Probable error (P.E.)
3. Semi-interquartile range (Q)
4. Average or mean deviation (A.D.)
Under normal conditions the P.E. and Q are equal. The standard
deviation is larger than the others (P.E. = 0.6745 S.D.).
STANDARD DEVIATION

In a normal or symmetrical curve the standard deviation is the distance
from the mean which in one direction includes about 34 per cent
(34.13) of the total cases. Sometimes this distance out from the mean is
large, in which case the members of the population are unlike each
other, and sometimes it is small, which indicates closer resemblances among
the members. A class with a large standard deviation in intelligence would
be heterogeneous; one with a very small standard deviation, homogeneous
with respect to intelligence.

Table 22 shows how the standard deviation is computed. In computing a
standard deviation the same procedure is used as in computing the mean.
Additional work is needed to compute the fd² column and then to substitute
in the formula.


TABLE 22. COMPUTATION OF THE STANDARD DEVIATION

Scores on word knowledge | f | d  | fd  | fd²
100                      | 1 |  4 |   4 | 16
95                       | 5 |  3 |  15 | 45
90                       | 8 |  2 |  16 | 32
85                       | 9 |  1 |   9 |  9   (sum of plus fd, +44)
80                       | 4 |  0 |   0 |  0
75                       | 3 | -1 |  -3 |  3
70                       | 4 | -2 |  -8 | 16
65                       | 2 | -3 |  -6 | 18
60                       | 4 | -4 | -16 | 64
55                       | 2 | -5 | -10 | 50   (sum of minus fd, -43)
                       N = 42          Σfd² = 253

c = Σfd/N = (44 - 43)/42 = 1/42 = .024
i = interval = 5

$$\text{S.D.} = i\sqrt{\frac{\Sigma fd^2}{N} - c^2} = 5\sqrt{\frac{253}{42} - (.024)^2} = 2.45 \times 5 = 12.25$$


If our distribution were normal, the mean, plus or minus 1 standard
deviation, would include 68.26 per cent of the total. In our case the mean
and standard deviation are 82.1 ± 12.25. If we add 12.25 to 82.1 we get
94.35 and if we subtract 12.25 from the mean we get 69.85. Between these
limits there are 28 cases or about 67 per cent of the total.
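
The short-method computation of the standard deviation can be checked with a brief Python sketch. The names used are hypothetical; the frequencies and step deviations are those of Table 22.

```python
import math

FREQS = [1, 5, 8, 9, 4, 3, 4, 2, 4, 2]       # f, from score 100 down to 55
STEPS = [4, 3, 2, 1, 0, -1, -2, -3, -4, -5]  # d, steps from the assumed mean
WIDTH = 5                                     # size of interval (i)


def short_method_sd(freqs, steps, width):
    """S.D. = i * sqrt(sum(f*d^2)/N - c^2), where c = sum(f*d)/N."""
    n = sum(freqs)
    c = sum(f * d for f, d in zip(freqs, steps)) / n
    mean_square = sum(f * d * d for f, d in zip(freqs, steps)) / n
    return width * math.sqrt(mean_square - c * c)


sd = short_method_sd(FREQS, STEPS, WIDTH)
print(round(sd, 2))
```

Carried to more decimal places the result is about 12.27; the 12.25 of Table 22 comes from rounding the square root to 2.45 before multiplying by the interval.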


THE PROBABLE ERROR (P.E.)
In a normal curve the probable error equals 0.6745 times the standard
deviation. The probable error is so frequently used in this manner that
we shall not introduce other ways of computing it.


THE SEMI-INTERQUARTILE RANGE (Q)
The formula for the semi-interquartile range is Q = (Q3 - Q1)/2, in which
Q3, the third quartile, is the 75th percentile and Q1, the first quartile,
is the 25th percentile. It has been frequently used and sometimes in
place of the probable error (see Table 20).

THE AVERAGE OR MEAN DEVIATION
This measure is computed by averaging the deviations from the mean
regardless of signs. If 25 students guessed the time of day they would
miss the true time by varying amounts in either direction. If we simply
averaged these deviations we would have the average deviation, although
the point of reference is the true time rather than a score such as the
mean we have used here.


USES OF STANDARD DEVIATION
One use of the standard deviation is of the greatest importance. It is the
so-called standard score. The formula is: standard score = (X - M_x)/σ_x,
where X is a single person's score, M_x is the mean, and σ_x the standard
deviation (sigma). One member of our group scored 78. Substituting in
the formula we get:

Standard score = (78 - 82.1)/12.25 = -4.1/12.25 = -.33

The score of 78, then, is only 0.33 standard deviation units below the
mean of the group.

Fig. 38. Normal curve, sigma units. Percentages at each sigma value. Bottom
line, T-scores.

An outgrowth of the standard score is the T-score. It originated out of an
attempt to develop units of mental measurement which would be equivalent.
Standard-deviation units derived from the scores of 12-year-olds (McCall)
were treated as shown in Fig. 38. Each standard-deviation unit was divided
into 10 equal parts. McCall assumed a mean of 50, so that the units were
always positive, beginning with 0 and going to 100. These units were
convenient, and most important of all they were about equal to each other.
A change from 20 to 30 is nearly equal to a change from 40 to 50, or from
70 to 80. These units are about the best we have. This procedure has been
generalized into a formula: T-score = 50 + [(X - M)/σ]10, in which X is an
individual's score, M is the mean, and σ the usual standard deviation.
This formula is accurate only when the distribution is normal but works
fairly well even when the original distribution deviates somewhat from the
normal. Let us take our distribution based on 42 cases (Table 22), which
has a mean of 82.1 and a standard deviation of 12.25. Applying the
formula we get T-score = 50 + [(X - 82.1)/12.25]10. What would be
the T-score of a person who scored 92?

T-score = 50 + [(92 - 82.1)/12.25]10 = 50 + [9.9/12.25]10 = 58

If we take another actual score, 58, the lowest case, we get

T-score = 50 + [(58 - 82.1)/12.25]10 = 30
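
The two conversions may be written as a short Python sketch. The function names are chosen here for illustration; the mean of 82.1 and standard deviation of 12.25 are the values found for the 42 vocabulary scores.

```python
MEAN = 82.1
SD = 12.25


def standard_score(x, mean=MEAN, sd=SD):
    """Distance of a raw score from the mean in standard-deviation units."""
    return (x - mean) / sd


def t_score(x, mean=MEAN, sd=SD):
    """T-score: mean set at 50, each standard deviation worth 10 points."""
    return 50 + 10 * standard_score(x, mean, sd)


for raw in (78, 92, 58):
    print(raw, round(standard_score(raw), 2), round(t_score(raw)))
```

A raw score equal to the mean always receives a T-score of 50, and a score one standard deviation above the mean receives 60, whatever the original units of the test.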
on of 2 
: most 


ADVANTAGES OF STANDARD SCORES
The two most popular procedures for changing raw scores which
differ largely in meaning to equivalent scores are (1) the standard score
and (2) the percentile. The percentile has the advantage of being easily
understood. A percentile score of 60 means that 60 per cent of the cases
are below the score in question. If a percentile score is 32, then 32 per
cent are below it and 68 per cent are above it.


these dis 


distribution, while only a little more than 2 per cent lie between the
second and third standard deviations. The standard score allows for that
queer arrangement of actual scores known as the normal distribution. The
real distance between the highest 1 per cent in intelligence, for example,
and the next is far more than that between the 49th and 50th percentiles.
The percentile assumes a rectangular distribution as in b (Fig. 39). The
percentages above each percentile are assumed to be the same all along the
base line, which simply is not a fact in the usual collection of data.

Fig. 39. Curve showing difference between percentiles and standard scores.

The percentile works very well between the 25th and the 75th percentile
but errs greatly in the extremes.


THE COEFFICIENT OF CORRELATION


Thus far we have been speaking of the statistics involving one variable.
The 42 scores studied were secured from a vocabulary test. Each
individual had just one score. In correlation, on the other hand, in most
of our work there are always two measures for each subject. The problem
is to discover the mutual relation, the correlation, between these
measures. We have been discussing correlation since our study of reliability
and validity. The index of reliability is usually a correlation between
two forms of a test, the repetition of the same test, or the odd scores
against the even scores. It was there indicated that high reliabilities
were desirable. Correlation might be defined as the average degree of
resemblance which exists between two tests, or two traits in the same
individuals, or one trait measured twice. It must be recognized that other
facts may be correlated which have no direct relation to individuals, such
as the correlation between average rainfall and crop yield. For our
purposes, correlation will usually be computed between two tests or traits,
and there will be a considerable number of individuals measured or else the
coefficient will not be reliable.


TABLE 23. COMPUTATION OF THE COEFFICIENT OF CORRELATION*

Word knowledge X | Miller Y | x   | y   | x²  | y²  | xy
92               | 89       |   9 |   3 |  81 |   9 |  27
88               | 86       |   5 |   0 |  25 |   0 |   0
97               | 114      |  14 |  28 | 196 | 784 | 392
95               | 104      |  12 |  18 | 144 | 324 | 216
100              | 117      |  17 |  31 | 289 | 961 | 527
58               | 58       | -25 | -28 | 625 | 784 | 700
90               | 114      |   7 |  28 |  49 | 784 | 196
94               | 105      |  11 |  19 | 121 | 361 | 209
72               | 82       | -11 |  -4 | 121 |  16 |  44
91               | 76       |   8 | -10 |  64 | 100 | -80
83               | 102      |   0 |  16 |   0 | 256 |   0
88               | 92       |   5 |   6 |  25 |  36 |  30
83               | 65       |   0 | -21 |   0 | 441 |   0
87               | 78       |   4 |  -8 |  16 |  64 | -32
82               | 103      |  -1 |  17 |   1 | 289 | -17
78               | 62       |  -5 | -24 |  25 | 576 | 120
64               | 76       | -19 | -10 | 361 | 100 | 190
68               | 62       | -15 | -24 | 225 | 576 | 360
97               | 109      |  14 |  23 | 196 | 529 | 322
95               | 95       |  12 |   9 | 144 |  81 | 108
86               | 69       |   3 | -17 |   9 | 289 | -51
85               | 78       |   2 |  -8 |   4 |  64 | -16
85               | 96       |   2 |  10 |   4 | 100 |  20
89               | 104      |   6 |  18 |  36 | 324 | 108
77               | 78       |  -6 |  -8 |  36 |  64 |  48
61               | 64       | -22 | -22 | 484 | 484 | 484
74               | 82       |  -9 |  -4 |  81 |  16 |  36
59               | 58       | -24 | -28 | 576 | 784 | 672

Sum (Σ): X = 2,318; Y = 2,418; Σx² = 3,938; Σy² = 9,196; Σxy = 4,613
Mean (M): X = 83; Y = 86

$$r = \frac{\Sigma xy}{\sqrt{\Sigma x^2}\,\sqrt{\Sigma y^2}} = \frac{4{,}613}{\sqrt{3{,}938}\,\sqrt{9{,}196}} = .77$$

* From Jordan, A. M., Educational Psychology, 3d ed., p. 473. New York:
Henry Holt and Company, Inc., 1942. By permission.


PEARSON PRODUCT-MOMENT METHOD


To Sir Francis Galton is usually given the honor of having first
developed and used the coefficient of correlation as we know it. It was Karl
Pearson, of the University of London, who derived for us the mathematical
formula. The formula is $r = \frac{\Sigma xy}{N\sigma_x\sigma_y}$, in which
r is the coefficient of correlation, x and y are deviations from their
respective means and are the same as d as we have used it, Σ is the sum
(after the deviations have been multiplied), N is the number of pairs,
$\sigma_x$ is the standard deviation (S.D.) of one variable, and $\sigma_y$
is the standard deviation (S.D.) of the other. In the definition the term
"average" was used. This term can be better understood if you note the N in
the denominator. It is well to relate this formula a little more closely to
what has already been learned in statistics. The standard score is
$(X - M_x)/\sigma_x$, in which X is a score and $M_x$ is a mean. Now
$X - M_x = x$, or d as we have used it. We now get, by substitution of x for
$X - M_x$ in the formula for the standard score, $x/\sigma_x$. In like manner
the standard score for y is $y/\sigma_y$. Now by multiplying these standard
scores together, adding up the products, and dividing the sum by the number
of pairs, we obtain the coefficient of correlation. This is the same as
adding up the xy products and dividing by the product of the standard
deviations multiplied by the number of pairs.

Examine the following computation very carefully both to learn how to
compute r and to understand it. The capital letters X and Y represent
individual scores (Table 23). The small letters x and y represent deviations
from the means, here taken as the nearest whole numbers. For example, John
received 92 on word knowledge (X) and 89 on the Miller test (Y). The mean of
the word knowledge scores ($M_x$) is 83; of the Miller ($M_y$), 86. Small x
then is 92 - 83 or 9 and small y is 89 - 86 or 3, etc. From now on it is
simply a process of substituting in the formula
$r = \frac{\Sigma xy}{N\sigma_x\sigma_y}$. For $\sigma_x$ we substitute its
equal, $\sqrt{\Sigma x^2/N}$, and for $\sigma_y$, $\sqrt{\Sigma y^2/N}$.
We now have

$$r = \frac{\Sigma xy}{N\sqrt{\Sigma x^2/N}\,\sqrt{\Sigma y^2/N}}$$

Now, Σxy = 4,613, Σx² = 3,938, Σy² = 9,196, and N = 28. Substituting,

$$r = \frac{4{,}613}{28\sqrt{3{,}938/28}\,\sqrt{9{,}196/28}}$$

The 28's cancel out, for $28\sqrt{1/28 \times 1/28} = 28/28 = 1$. Therefore

$$r = \frac{4{,}613}{\sqrt{3{,}938}\,\sqrt{9{,}196}} = .766 \text{ (or .77)}$$

Please note that this is the coefficient when we take the mean to its
nearest whole number. The mean of the word knowledge scores is 2,318
divided by 28 or 82.78 and the mean for the Miller is 2,418
divided by 28, or 86.36. There are ways of correcting for this use of the
nearest whole number for the mean, but in most cases the difference in
r is negligible. In this case the r when computed from the means
82.78 and 86.36 is .767.
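
The computation of r can be retraced with a short Python sketch. The function name pearson_r and its optional arguments are hypothetical conveniences; the 28 pairs are those of Table 23, and the formula used is the sum of xy products divided by the square root of the product of the sums of squares.

```python
import math

# Word knowledge (X) and Miller (Y) scores for the 28 pairs of Table 23.
X = [92, 88, 97, 95, 100, 58, 90, 94, 72, 91, 83, 88, 83, 87,
     82, 78, 64, 68, 97, 95, 86, 85, 85, 89, 77, 61, 74, 59]
Y = [89, 86, 114, 104, 117, 58, 114, 105, 82, 76, 102, 92, 65, 78,
     103, 62, 76, 62, 109, 95, 69, 78, 96, 104, 78, 64, 82, 58]


def pearson_r(xs, ys, mean_x=None, mean_y=None):
    """r = sum(xy) / sqrt(sum(x^2) * sum(y^2)), x and y being deviations.

    If no means are supplied the exact means are used; supplying rounded
    means reproduces the whole-number shortcut of the text.
    """
    n = len(xs)
    mx = sum(xs) / n if mean_x is None else mean_x
    my = sum(ys) / n if mean_y is None else mean_y
    sum_xy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sum_x2 = sum((a - mx) ** 2 for a in xs)
    sum_y2 = sum((b - my) ** 2 for b in ys)
    return sum_xy / math.sqrt(sum_x2 * sum_y2)


r_rounded_means = pearson_r(X, Y, mean_x=83, mean_y=86)
r_exact_means = pearson_r(X, Y)
print(round(r_rounded_means, 3), round(r_exact_means, 3))
```

The two values differ only in the third decimal place, which bears out the remark that the use of whole-number means makes a negligible difference here.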


е 


SPEARMAN'S RANK-DIFFERENCE CORRELATION METHOD

The method of rank differences is another method of computing the
coefficient of correlation. The result can be converted into the Pearsonian
coefficient. The coefficient is called rho (ρ) to distinguish it from the
Pearson r. The differences between the two coefficients are usually small.

There are several occasions when it makes a definite contribution:
1. When the scores themselves are gathered in the form of ranks, such
as the ranking of a class for


For examP™” 
the problem of whet 


is more than 50. 


are 
scores ^ 
Procedure (Table 24) the same т у 


In our illustration of this 
used which 
cedure is to rank the scores j 
ranks and place the diff 
Square these differences 


ormula p = 1 — wa? — 1) te 

A а о 
€ in ranks and y is the number of pairs. 25. You gill н of 
Let us look now at the process exemplified in Table then а P^ mee 
that there are 28 pairs as before (Table 24). We ae е ber” 1): 
our correlation already. The denominator of our be Ф { tbe 
4 — 1) When 28 is substituted for N in the form merator ? 
What now remains to be done is to compute the nu ch 
fraction, 


mar 
in he 
umber$ , "di 
In computing the numerator we first rank the n rank is 
column. In column X the largest number is 100, so 


TABLE 24. COMPUTATION OF RHO (ρ) BY THE METHOD OF SQUARED DIFFERENCES IN
RANK

Word knowledge X | Miller Y | Rank X | Rank Y | d*   | d²
92               | 89       |  7     | 13     |  6   |  36
88               | 86       | 11.5   | 14     |  2.5 |   6.25
97               | 114      |  2.5   |  2.5   |  0   |   0
95               | 104      |  4.5   |  6.5   |  2   |   4
100              | 117      |  1     |  1     |  0   |   0
58               | 58       | 28     | 27.5   |   .5 |    .25
90               | 114      |  9     |  2.5   |  6.5 |  42.25
94               | 105      |  6     |  5     |  1   |   1
72               | 82       | 23     | 15.5   |  7.5 |  56.25
91               | 76       |  8     | 20.5   | 12.5 | 156.25
83               | 102      | 17.5   |  9     |  8.5 |  72.25
88               | 92       | 11.5   | 12     |   .5 |    .25
83               | 65       | 17.5   | 23     |  5.5 |  30.25
87               | 78       | 13     | 18     |  5   |  25
82               | 103      | 19     |  8     | 11   | 121
78               | 62       | 20     | 25.5   |  5.5 |  30.25
64               | 76       | 25     | 20.5   |  4.5 |  20.25
68               | 62       | 24     | 25.5   |  1.5 |   2.25
97               | 109      |  2.5   |  4     |  1.5 |   2.25
95               | 95       |  4.5   | 11     |  6.5 |  42.25
86               | 69       | 14     | 22     |  8   |  64
85               | 78       | 15.5   | 18     |  2.5 |   6.25
85               | 96       | 15.5   | 10     |  5.5 |  30.25
89               | 104      | 10     |  6.5   |  3.5 |  12.25
77               | 78       | 21     | 18     |  3   |   9
61               | 64       | 26     | 24     |  2   |   4
74               | 82       | 22     | 15.5   |  6.5 |  42.25
59               | 58       | 27     | 27.5   |   .5 |    .25

N = 28
Σd² = sum of d² = 816.50

$$\rho = 1 - \frac{6(816.50)}{28(28^2 - 1)} = 1 - \frac{6(816.50)}{28(783)} = 1 - \frac{4{,}899}{21{,}924} = 1 - .2235 = .78$$

* d = difference in ranks; signs are always plus.


The next in size is 97, but you will note there are two of them. Each
has an equal right to be ranked 2 or 3. What we do is simply take the
mean of 2 and 3 and give each one 2.5. Notice that we have now used ranks
1, 2, and 3 and that the next number will be ranked 4. The number next in
size is 95, but there are two of them; hence each is ranked 4.5. The
process is continued until 58, the smallest number in column X, is ranked
28. Column Y is ranked in the same manner. We then find the differences
in rank without regard to signs, square the d's, add them up, and
substitute in the formula.
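
The ranking of tied scores and the substitution in the formula may be sketched in Python as follows. The helper names are hypothetical; the scores are the 28 pairs of Table 24, and tied scores share the mean of the ranks they occupy, as described above.

```python
X = [92, 88, 97, 95, 100, 58, 90, 94, 72, 91, 83, 88, 83, 87,
     82, 78, 64, 68, 97, 95, 86, 85, 85, 89, 77, 61, 74, 59]
Y = [89, 86, 114, 104, 117, 58, 114, 105, 82, 76, 102, 92, 65, 78,
     103, 62, 76, 62, 109, 95, 69, 78, 96, 104, 78, 64, 82, 58]


def average_ranks(values):
    """Rank 1 goes to the largest value; ties share the mean of their ranks."""
    ordered = sorted(values, reverse=True)
    rank_of = {}
    for v in set(values):
        first = ordered.index(v) + 1      # first rank occupied by this value
        count = ordered.count(v)          # number of tied cases
        rank_of[v] = first + (count - 1) / 2
    return [rank_of[v] for v in values]


def sum_squared_rank_differences(xs, ys):
    """Sum of d squared, d being the difference between paired ranks."""
    return sum((a - b) ** 2
               for a, b in zip(average_ranks(xs), average_ranks(ys)))


def spearman_rho(xs, ys):
    """rho = 1 - 6 * sum(d^2) / (N * (N^2 - 1))."""
    n = len(xs)
    return 1 - 6 * sum_squared_rank_differences(xs, ys) / (n * (n * n - 1))


sum_d2 = sum_squared_rank_differences(X, Y)
rho = spearman_rho(X, Y)
print(sum_d2, round(rho, 3))
```

Because only the ranks enter the formula, rho is unaffected by how far apart neighboring scores lie, which is one reason it may differ slightly from the Pearson r for the same pairs.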
INTERPRETATION OF COEFFICIENTS OF CORRELATION
A coefficient of correlation is dependent for its interpretation on (1) its
size, and (2) the size of the sample and its representativeness of the
population from which it was drawn. For example, 18-year-old college
students would not be representative of the total population of
18-year-olds.

Size of the Coefficient

The nearer the coefficient is to +1.00 or -1.00 the higher the correlation
and the closer the resemblance. We have said that reliability coefficients
should usually be .85 or above in most cases. The value of a validity
coefficient depends upon its criterion. If a coefficient of .23 is computed
with a criterion and the test is not related to the other factors in a
battery, it may be profitable to use it. Correlations in the neighborhood of
.60 have been called "marked," "significant," and, under some conditions,
"high." A correlation of .60 between test scores and students' marks would
be high. On the other hand, a reliability coefficient of that size
would be definitely low. You see, the interpretation of the correlation is
partly a matter of magnitude and partly a matter of the type of relation
which it expresses.




Reliability of the Coefficient 


One of the problems which always confronts an investigator is whether 
this correlation which he has computed is representative or not. In 
technical terms is the computed r a true r? It must always be kept in 
mind that the pairs drawn are only a sample of what the total popula- 
tion is. The data with which the two methods of correlation were illus- 
trated were (X) scores on an intelligence test and (Y) scores on a test of 
word knowledge. These were drawn from a college population. The true 
correlation would be that computed from the use of scores secured from 
all college students. Fortunately the correlation between the 28 pairs drawn at random gives some indication as to what the true r would be. The formula for the standard error of the Pearson coefficient of correlation is $\mathrm{S.E.}_r = (1 - r^2)/\sqrt{N - 1}$. Clearly, its size depends on the size of r and the size of N. If N is very large, the fraction is small and $\mathrm{S.E.}_r$ is small, a condition which indicates high reliability. If r is large and N is large, the r is very reliable. If we use our coefficient we get $\mathrm{S.E.}_r = (1 - r^2)/\sqrt{N - 1} = (1 - .593)/5.196 = .078$ (or .08). We may now write r = .77 ± .08, which when interpreted means:

1. The chances are 68 in 100 (see page 507) that the true r lies between .69 and .85. This is the 1 standard error limit.
2. The chances are 95 in 100 that the true r lies between plus or minus 2 $\mathrm{S.E.}_r$, or between .61 and .93.
3. The chances are 99.7 in 100 that the true r lies between plus or minus 3 $\mathrm{S.E.}_r$, or between .53 and 1.00.

The numbers 68, 95, and 99.7 are taken from a table which shows the percentage of total scores appearing under a graph representing the normal curve at 1 S.D. (or here S.E.), 2 S.D.s, and 3 S.D.s.¹ It is thus seen that while the true r cannot be calculated, its limits can.
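
The arithmetic behind these limits can be checked with a short Python sketch. It assumes the formula as given in this section, S.E. of r = (1 − r²)/√(N − 1); the function names are hypothetical, and the upper limit is capped at 1.00 because a coefficient cannot exceed that value.

```python
import math


def standard_error_of_r(r, n):
    """Standard error of a Pearson r, using (1 - r^2) / sqrt(N - 1)."""
    return (1 - r * r) / math.sqrt(n - 1)


def limits(r, se, k):
    """Limits at k standard errors, kept inside the possible range of r."""
    return max(-1.0, r - k * se), min(1.0, r + k * se)


se = standard_error_of_r(0.77, 28)
print(round(se, 3))  # 0.078

# Using the rounded value .08, as the text does.
for k in (1, 2, 3):
    lower, upper = limits(0.77, 0.08, k)
    print(k, round(lower, 2), round(upper, 2))
```

With the rounded standard error of .08 the loop reproduces the three pairs of limits listed above.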


¹ See Garrett, H. E., Statistics in Psychology and Education, 3d ed., p. 115. New York: Longmans, Green & Co., Inc., 1947.

USES OF THE COEFFICIENT OF CORRELATION

The coefficient of correlation is one of the most widely used statistical concepts. Not only in the fields of education and psychology has it achieved great statistical prominence, but also in the fields of agriculture, sociology, and economics, to name a few, it has found favor. In testing, this concept has been useful in four areas: (1) reliability, (2) validity, (3) prognosis, and (4) test construction.

Reliability

In computing the reliability of measurements the coefficient of correlation is almost universally used. Whether the reliability is computed by the repetition of the same test, by the administration of two forms of the same test, or by the odd-even technique, correlation is used. A different symbol is usually used for repetition, for two forms, and for the odd-even technique (see page 28). The reliabilities of batteries of achievement tests usually run .95 or above; those of intelligence tests are only slightly lower, and those of inventories, about .85 to .92. The higher the coefficient, the less variation is there from one form to the other, i.e., the more reliable the test. The reliabilities of school marks, where they have been determined, are in the neighborhood of .65 to [...].

An extension of the notion of reliability appears in the standard error of a score. What we have, then, is a sample from which we can predict a true score. The formula for a standard error of measurement is $\mathrm{S.E.}_{meas} = \mathrm{S.D.}\sqrt{1 - r}$. From this formula it is clear that the amount of variation expected from a single score depends on (1) the S.D. of the group being studied, and (2) the reliability coefficient. If r were 1.00, the $\mathrm{S.E.}_{meas}$ would be zero. Suppose that in computing the reliability the correlation between two forms were .96 with a standard deviation of 10. Then $\mathrm{S.E.}_{meas} = 10\sqrt{1 - .96} = 10 \times .2 = 2$. If a pupil made a score of 65, then we could say that the chances are 68 in 100 that the true score lies between 63 and 67 (1 S.D.), 95 in 100 that the true score lies between 61 and 69 (2 S.D.s), and more than 99 in 100 that the true score lies between 59 and 71 (3 S.D.s). These numbers are secured from a table which indicates the percentage under different portions of the normal curve. It has been demonstrated that the deviations of a score fall into the form of the normal curve.

To return to our $\mathrm{S.E.}_{meas}$, we find that it is easily understood and an excellent measure of reliability. Suppose that instead of an S.D. of 10 and a correlation of .96 between forms we had an S.D. of 15 and a correlation of .85. Then




$\mathrm{S.E.}_{meas} = 15\sqrt{1 - .85} = 15 \times .39 = 5.85$. Let us round this off and call it 6. Our score now becomes 65 ± 6. The chances are 68 in 100 that the true score lies between 59 and 71 (1 S.D.); 95 in 100 that it lies between 53 and 77 (2 S.D.s); and more than 99 in 100 that it lies between 47 and 83 (3 S.D.s). It is easily seen that, if we have to go as low as 47 and as high as 83 to get the true score, our sample score is not of much value. In the first instance, with an S.D. of 10 and a correlation of .96, the true score had an extreme variation of 59 to 71. It is clearly seen that one needs a high reliability in a test if it is to be of any real use for individual diagnosis.
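
The two cases compared above can be set side by side in a brief Python sketch. It assumes the standard error of measurement takes the form S.D. × √(1 − r), which is the form the worked figures in this section follow; the function names are hypothetical.

```python
import math


def se_measurement(sd, reliability):
    """Standard error of measurement: S.D. * sqrt(1 - r)."""
    return sd * math.sqrt(1 - reliability)


def score_band(score, se, k):
    """Range of k standard errors on either side of an obtained score."""
    return score - k * se, score + k * se


first = se_measurement(10, 0.96)   # about 2
second = se_measurement(15, 0.85)  # about 5.81 (the text rounds sqrt(.15) to .39 first)
print(round(first, 2), round(second, 2))

# Bands around an obtained score of 65, using the rounded values 2 and 6.
for se in (2, 6):
    print([score_band(65, se, k) for k in (1, 2, 3)])
```

The second list of bands is three times as wide as the first, which is the point of the comparison: a lower reliability and a larger S.D. together make a single score a much looser estimate of the true score.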

Another use of the reliability coefficient of correlation is in computing the predictive efficiency of a test. Suppose we use the following formula: $E = (1 - \sqrt{1 - r^2})\,100$. By multiplying by 100 we change the answer into percentage of efficiency. Let us take a correlation coefficient and substitute it in the formula. Let r equal .80. Then

$$E = (1 - \sqrt{1 - .64})\,100 = 40 \text{ per cent efficient}$$

(see page 32). It is amazing how inefficient our best tests are when measured by this accurate formula. Even our best tests, those with coefficients near .95, are only about 68 per cent efficient, while those with lower reliability are correspondingly less efficient.
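
A few lines of Python show how quickly the efficiency falls as r drops. The sketch assumes the formula quoted above, E = (1 − √(1 − r²)) × 100; the function name is hypothetical and the list of coefficients is chosen only for illustration.

```python
import math


def predictive_efficiency(r):
    """Percentage of predictive efficiency: (1 - sqrt(1 - r^2)) * 100."""
    return (1 - math.sqrt(1 - r * r)) * 100


# Illustrative coefficients; .80 is the worked example in the text.
for r in (0.50, 0.60, 0.80, 0.95):
    print(r, round(predictive_efficiency(r), 1))
```

For r = .80 the function returns 40, matching the worked example, and for r = .95 it returns a little under 69.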
Validity 


The validity of a test is usually obtained by correlating it with some criterion which indicates the more certain presence of what the test measures, or with other proved tests of the same trait. In our text we have mentioned the correlations of intelligence tests with success in life, and of group tests with individual tests such as the Stanford-Binet. The tests of the Army Air Force were correlated with success in flying, the Minnesota Mechanical Assembly Test with success of junior high students in a course in mechanics, and the Minnesota Clerical Test with the success of stenographers. During the First World War the scores on Army Alpha correlated .50 to .70 with officers' estimates of the success of their men. Finally, inventories of neuroticism have been correlated with other inventories and with the presence or absence of neurotic symptoms as discovered in a clinic. In a variety of correlations, indications are secured which point to the measurement by the test of those traits which it is attempting to measure.

Prognosis

Prediction is one of the most sought-after outcomes of testing. With what confidence can we predict from the present I.Q. of a child what I.Q. the child will have 3 years hence? Will this person who scores high on an aptitude test have success in stenography? Will that man who scored high on the Air Force tests really succeed in flying? Whatever the answers are, they are determined by correlation. One of the questions frequently asked in school is, "Will this pupil have success in studying a foreign language?" In answering this question, a prognosis test of language ability is given to a large group of pupils, subsequent marks in a language are collected, and then a correlation is computed between the capacity as measured by the test and the success as estimated by the teacher or by an achievement test. Does an intelligence test prophesy subsequent college marks better than high school records? In such a case correlations are computed between intelligence test scores and college marks and between high school marks and college marks, and an answer is given in terms of the coefficient of correlation. Thus we say r between high school marks and college marks averages about .55 to .60, and between intelligence tests and college marks about .50 to .55.

The point is that once these relations are determined we can use the scores obtained at an earlier date to predict what persons will do at a later date. For example, those with top scores in the tests for aviators succeeded in flying in over 80 per cent of the cases; those with the lowest scores succeeded in less than 20 per cent of the cases.


SAMPLING: STANDARD ERROR OF THE MEAN AND THE STANDARD DEVIATION

Suppose that we wished to know the mean height of 18-year-old college boys and drew from those available 100 cases and computed their mean. It would be clear that the relation of this sample mean to that of the whole would depend on (1) the size of the sample and (2) the size of the standard deviation. The true mean for all 18-year-old college boys could be found only by measuring every one of them and computing the mean. However, this is seldom possible. The formula for $\mathrm{S.E.}_{mean}$ gives us a clear indication of the limits within which the true mean would fall: $\mathrm{S.E.}_{mean} = \mathrm{S.D.}/\sqrt{N - 1}$. Suppose that the mean we computed for 100 cases is 68 inches and the S.D. is 2.6 inches. Then $\mathrm{S.E.}_{mean} = 2.6/\sqrt{100 - 1} = .26$. We can now say the chances are 68 out of 100 that the true mean lies within ±1 S.E., or between 67.74 and 68.26 inches; that the chances are 95 out of 100 that the true mean lies between 67.48 and 68.52; and finally, that the chances are more than 99 out of 100 that the true mean lies between 67.22 and 68.78 inches. In like manner is interpreted the standard error of the standard deviation, $\mathrm{S.E.}_{\sigma} = \mathrm{S.D.}/\sqrt{2(N - 1)}$.
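
The height example can be reproduced with a short Python sketch. It assumes the standard error of the mean is S.D./√(N − 1), as in the worked figures, and takes the standard error of the standard deviation to be S.D./√(2(N − 1)); that second form is an assumption here, since the printed formula is hard to read. The function names are hypothetical.

```python
import math


def se_mean(sd, n):
    """Standard error of the mean: S.D. / sqrt(N - 1)."""
    return sd / math.sqrt(n - 1)


def se_sd(sd, n):
    """Standard error of the S.D., assumed here to be S.D. / sqrt(2 * (N - 1))."""
    return sd / math.sqrt(2 * (n - 1))


mean, sd, n = 68.0, 2.6, 100
se = round(se_mean(sd, n), 2)
print(se)  # 0.26

# Limits for the true mean at 1, 2, and 3 standard errors.
for k in (1, 2, 3):
    print(k, round(mean - k * se, 2), round(mean + k * se, 2))

print(round(se_sd(sd, n), 2))  # 0.18
```

The loop prints the same three pairs of limits as the text: 67.74 to 68.26, 67.48 to 68.52, and 67.22 to 68.78 inches.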


SUMMARY

Statistics is used in the construction and interpretation of tests and in their application. Scores on tests of any kind are arranged in tables of distribution from which much can be learned by inspection. Measures of central tendency (mean, median, and mode) may then be computed. The most important of these is the mean. It, however, is greatly influenced by extreme cases. When these extreme cases are accidental or not truly representative the median increases in importance. Measures of dispersion or scatter state quantitatively the amount of clustering of the scores around the central tendency. The standard deviation is the most reliable of the measures of dispersion. The semi-interquartile range (Q), the average deviation (A.D.), and the probable error (P.E.) are other measures of dispersion. The standard deviation is used in computing T-scores or standard scores. These scores are based on the true distribution of scores.

The coefficient of correlation indicates the average degree of resemblance found between two traits in the same group of individuals when each individual is measured twice. Two procedures for computation, the Pearson product-moment method and the Spearman method of rank differences, are introduced. The uses of this coefficient are legion. In computing reliabilities and validities of tests this coefficient is indispensable. Its substitution in formulas to denote the reliability of scores and the efficiency of tests adds greatly to our understanding of these terms.

Running through our whole treatment is the concept of sampling. One measure of an individual is merely a sample of his performance, not the true measure. The mean of any sample of a population is just one of the possible means which other samples drawn from it would show. Fortunately, from a single score, coefficient of correlation, measure of central tendency, or measure of dispersion, ranges within which the true value lies may be calculated and the level of confidence in each range indicated. No concept in statistics helps more in the interpretation of these scores than that of sampling.


QUESTIONS AND EXERCISES

1. Distinguish between the mean, median, and mode. Which measure is most influenced by extreme cases? Why?

2. Below are the original scores made on a test of word knowledge by college students, the highest possible score being 150:

105  126  103
110   94  124
125  115   65
 96  112  106
124  107  131
118  107  126
118  114  [?]
 16  118
117  119
104  139
105   96
108  108
107  119
123  122
 61  112
116  129

Make a table of distribution from these data, using a convenient interval of 5 or 7. Define accurately the beginning and end of each step. Make all computations from this table of distribution.

3. From the above table, compute (a) the median and the 40th, 25th, and 75th percentiles; (b) the mean from the assumed mean; (c) the standard deviation and Q.

4. Suppose this table were a representative sample of a defined population: how would you calculate norms?

5. Compute several T-scores from this distribution. Why is it that a T-score is more accurate than a percentile score?

6. In the accompanying table are 25 pairs of scores: X (health knowledge) and Y (socioeconomic level).

No. | Health knowledge, X | Socioeconomic level, Y
  1 |         53          |   [?]
  2 |         50          |   [?]
  3 |         48          |   [?]
  4 |         50          |   [?]
  5 |         49          |   14
  6 |         49          |   [?]
  7 |         48          |   [?]
  8 |         52          |   16
  9 |         49          |    7
 10 |         46          |   [?]
 11 |         51          |   14
 12 |         49          |   [?]
 13 |         48          |   [?]
 14 |         52          |   19
 15 |         47          |    7
 16 |         48          |   11
 17 |        [?]          |   [?]
 18 |        [?]          |   [?]
 19 |         51          |   [?]
 20 |        [?]          |   14
 21 |         45          |   13
 22 |         45          |   12
 23 |         46          |   10
 24 |         43          |   13
 25 |         45          |   [?]

a. Compute r (1) by the Pearson product-moment method, (2) by the Spearman method of rank differences.

b. Interpret this coefficient as to its size and as to its reliability (apply the standard error of r).

c. How does the problem of sampling enter into your interpretation of r?

7. a. How can the standard error of measurement be used to interpret the meaning of a score?

b. Given a reliability coefficient of .90, a mean of 50, and an S.D. of 10 (S.D. the same on each form), if a subject scored 27, within what limits would his true score lie? State the level of confidence in each case.

8. Given a mean of 63, an S.D. of 10, and an N of 121, within what limits would the true mean lie? How does sampling enter into the interpretation?


BIBLIOGRAPHY

GARRETT, HENRY E.: Statistics in Psychology and Education, 3d ed. New York: Longmans, Green & Co., Inc., 1947.

GUILFORD, J. P.: Fundamental Statistics in Psychology and Education, 2d ed. New York: McGraw-Hill Book Company, Inc., 1950.

WALKER, HELEN M.: Elementary Statistical Methods. New York: Henry Holt and Company, Inc., 1943.


Index 


A 


Ааай, Geneva Р., 247 
^ ie Allan, 173-174 
21 evement-test batteries, 79-93 
evelopment of, 79-82 
evaluation of, 87-90 
geography, 189-191 
ea 146-149 
iterature, 152-156 
mathematics, 226-228 
reading, 96-97 
Science, 250-252 
Social, 186-189 
Spelling, 122-123 
types of, 82-87 
A ists of, 90-92 
Chievement tests, characteris! 
constructing, 40-66 
essay-type questions, 41-43, 51-63 
organization and arrangement, 56-57 
short-answer questions hased onpi 
call, 43-47 
recognition, 47— 
short-answer tests, higher 
, esses, 55 
rene 15-21 
d ins, D. C., 442-446 
ministering of tests, 72 


tics, 9-10 


mental proc- 


Tennistrability of tests, 3 
Dion geometry, prognostic tests of, 
232-233 


Objectives ir i 
л teaching of, 
ge of, 234-237 
rognostic, 241 
pes Mildred M., 39 
en Council Civics an! 
And est, 195 
тазод, Roy N., 287 
nderson, Theresa W., 339 
An Ono WEN 20; 142 
zr Dany M., 274 
eciation of literature, 
Ages 
itude tests, 22, 24 


d Government 


measurement of, 


Aptitude tests, for art, 299-306 
for mechanics, 317-329 
for music, 288-294 
Arithmetic, survey batteries, 226-228 
tests of, 226-232 
diagnostic, 229-232 
separate, 228-229 
Army Alpha and Army Beta, 379-381 
Arthur, Grace, 371, 376 
Arthur’s Point Scale of Performance Tests. 
371 D 
Arts, achievement, 306-307 
capacity in, 299-306 
measurement of, 298-307 
objectives in the teaching of, 298-299
Ashbaugh, E. J., 124, 142 
Aspects of personality (test), 475-477
construction and scoring, 476-477
three dimensions of, 475
Attitude scale, construction of, 451 
‘Attitudes, changes in, 4 

definition of, 447-448 

description of, 

learning of, 448-449 

measurement of, 450-460 

in social sciences, tests of, 201-202 


uses of scales, 4 
Ayres, L. P» 123, 142, 143 
Ayres Measuring Scale for Handwriting, 


130-131 
Ayres Spelling Scale, 123-124 


B 


Babcock, Harriet, 334 
Ball, Rachel S., 491, 494-495 


Bare, T. Н., 
Barrett, Dorothy М., 287 
Barrett Ryan-Schrammel English Test, 158 
Batteries of fundamentals, 83-8 
Becker, Ida S., 246 
Beliefs оп social issues, test of, 455-456 
Bell, Hugh М., 471, 494 
Bell Adjustment Inventory, 471-473
divisions of,
interpretation of, 472
validity of, 472-473
Bennett, George K., 334 
Bernreuter, Robert G., 376 
Bernreuter Personality Inventory, 468-471 
nature and construction, 468-469 
scoring of, 469 
validity of, 470-471 
Betts, E. А., 141 
Binet, Alfred, 10, 354 
Bingham, Walter Van Dyke, 39, 274, 287, 316, 334
Biology tests, 255-258
Bixler, Harold E., 161
Bixler High School Spelling Test, 161
Blackstone, E. G., 287
Blaisdell Instructional Tests in Biology, 262


Bloom, Benjamin S., 39 
Bogardus, E. L., 464 
Bogardus Scale of Social Distance, 453 


Bookkeeping tests, 280-284 
list of, 283-284 


Business Entrance Tests, 
282-283 
Bookwalter, Kar! 


1 W., 350 
Bovard, John F, 


» Clifford Lee, 340, 349 
A., 232 
rownell’s Posture Silhouette 
tuner, Herbert B., 463 
3 ton A., 223 
K., 69, 205, 223, 233, 246 
19, 494 


Scale, 340 


, 283, 


ent tests, 2 
list of, 286 84-285 
тыша education, measurement op 273_ 
objectives in, 273 
tests, bookkeeping, 280-284 
clerical, 274-280 
content, 284—286 
Business Fundamentals and General Information Test of United-NOMA Business Entrance Tests, 285
Buswell, G. T., 246
Buswell-John Diagnostic Test for Fundamental Processes in Arithmetic, 230-231


C


5-89, 94, 
California Achievement err 8 
122, 149, 227-228, s ^ Occupa 
California Aptitude Tests 926-328 
(Roeder and Graham), “Test (Stolz), 
California Group Functional 
338 396 
California Intelligence Test, а 5 
California Test of Personality; 
dimensions of, 473-474 473 
inventories for all grades, 
validity of, 474-415 
Canning, L. B., 5 
СЫ н ^ ena 
Carey, Stephen M., 333 А 
Carroll, Bacher A., 175, ae ian 75-178 
Carroll Prose Appreciation 
Carter, H. D., 446 
Carter, Ralph C., 59, 65 
Cattell, J. McKeen, 353 
Chase, Stuart, 273 
Chave, E. J., 451, 464 
Chemistry tests, 258-260 
Cheydleur, F. D., 223 
Civics tests, 195 G 
Civilian occupations, А 


tions 


1 
CT scores, 4 


416 " 
Clark, R. S., 4 
Clark, Willis W., 94, 142 -57 


Classroom tests, conato ү 57-63 
essay-type questions, " 
higher mental processes, Sent, 56 5 
organization and arrange 
matching, 52-55 Ре 
multiple-choice, KAD а 
sentence-completion, 3-55 

short-answer | ons; 
true-or-false, 49-. 6 

Cleeton, Glen U., “oo 

Cleeton Vocational Ir 5 

430-431 - 216-28 

Clerical achievement pour r 

Clerical content tests, 23 216-21 

Clerical tests, achievement, 
aptitudes, 274-276 

Cole, Luella, 200 


/ 
1 


Invent 


+ ene 
:ntelli£ 
and in 


College success prediction Ж est 
tests, 410-412 reau Alge , 
Columbia Research Bu m 


219 ңе, 

234-235 . test, 27# 
Commercial education suf, rith 
Compass Diagnostic 

229-230 А 
Complete batteries 

82-83 
Conard, Edith U., 143 


[a 
tes 

pievement 
of ac 




Conard Manuscript Writing Standards, 
131-133 
Concepts used in social sciences (Pressey 
test), 200-201 
Cook, Walter W., 403-404 
Cooke, Dennis H., 246 
Cooperative Algebra Test, 235-237 
Cooperative American History Test, 192- 
193 
Cooperative Biology Test, 255-257 
Cooperative Chemistry Test, 258-260 
Cooperative Economics Test, 194 
Cooperative English Test, effectiveness of 
expression, 162-163 
mechanics of expression, 158 
organization, 163-164 
reading comprehension, 167-170 
spelling, 162 
Cooperative French Test, 209-210 
Cooperative General Achievement Tests, 
196-197 
Cooperative German Test, 214-216 
Cooperative Latin Test, 217-219 
Чопро бод Literary Acquaintance Test, 
Cooperative Mathematics Test for Grades 
7, 8, and 9, 228-229 
Cooperative Modern European History 


Test, 193 
Cooperative Physics Test, 260-261 
Cooperative Plane Geometry Test, 238-240 
Test for Grades 7, 8, 
udies Test for Grades 
7, 8, and 9, 189, 195-196 


Cooperative Spanish Tests, 211-213 
Coordinated Scales of Attainment, 78, 94, 
153, 186-187, 228, 250-251 
Correlation, coefficient of, 509-518
interpretation of, 514-515
Pearson product-moment method, [...]-512
Spearman rank-difference method, 512-514
uses of, 515-518
Courtis, S. A., 14
Courtis Research Tests in Arithmetic, 15
Cozens, Frederick W., 335, 340, 342, 349, 350
Crawford, John Edmund, 324
Cronbach, Lee J., 15, 26, 39, 65, [...], 419, 494
Cruickshank, Ruth M., 334
Cubberley, Hazel J., 342, 349
Cureton, Thomas K., 350
Cureton, Thomas K., Jr., 350
Curtis, Dwight K., 271


D


Darley, J. G., 494 

Dashiell, J. F., 448 

Daugherty, M. L., 143 

Davis, G., 142 

Davis, H., 404, 419 

Davis, Ira C., 271 

Detroit Mechanical Aptitude Examination 
for Girls, 318 

Dewey, B., 323 


Dewey, John, 424 
Diagnostic Test for Fundamental Processes in Arithmetic (Buswell and John), 230-231

Diamond, Leon N., 271 

Dickson, V. E., 404 

Differential Aptitude Tests, 328-329 

Drake, Raleigh M., 333 

Duran, June C., 334 

Durrell, Donald D., 141, 419 

Durrell Analysis of Reading Difficulty, 
115-117 


E 


Economics tests, 194 

Economy, 37 

Edgren, H. D., 350 

Educational guidance and intelligence tests, 
407—409 

E.R.C. Stenographic Aptitude Test, 275- 
276 

Eldridge, R. C., 120, 142 

Ellingson, Mark, 435, 446 

Elliott, Edward C., 3, 39, 43 

Ellis, Albert, 482, 494 

Emerson, Marion Rines, 334 

Engle-Stenquist Home Economics Test, 
312-313 

English composition, scales of, 164-167
Hillegas, 165
Hudelson, 165-167
Lewis, 165
Nassau County, 165
Van Wagenen, 165

English-usage tests, 158-160

English vocabulary tests, Cooperative, 171
Inglis, 171
Equal-appearing units, 451
Espenchade, Anna, 350
Essay examination, weakness of, 3
Essay-type questions or examinations, 41-43, 57-63
causes of unreliability, 41-43
improvement, of questions, 59-60
of scoring, 61-63
value of, 58-59



Examination, in bookkeeping and account- 
ing, 280-282 
in plane geometry, 238 


F


Farnsworth, P. R., 289, 292, 333
A Richard, 322 
Faulkner, Ray, 333 
Feder, Daniel D., 446 
Feebleminded, interest in, and intelligence 
tests, 354-355 
Filer and O'Rourke Rating Scales, 485-486, 
494 
Filing test, United-NOMA Business En- 
trance Tests, 284 
Fine arts and manual arts, measurement of, 
288-334 
arts, fine, 298-307
manual, 307-329
music, 288-298
objectives, 295, 298-299, 308-309
Flanagan, J. C., 271, 468, 469, 494 
Foran, Thomas G., 94, 142 
Foreign languages, measurement of, 207- 
224 
objectives in teaching, 207-208 
tests, French, 208-211 
German, 214-217 
Latin, 217-220 
Spanish, 211-214 
Forrest, Ruth, 377 
Foster, J. G., 376 
Foster, Josephine C., 376 
Fransden, Arden, 446 
Franseen Diagnostic Tests in Language, 151-152
Freeman, F. N., 135, 143, 361, 376
Freeman Chart for Diagnosing Faults in Handwriting, 134-135
French, Esther, 343, 350
French tests, 208-211
lists of, 210-211
Frequency of occurrence, check on validity,
Froehlich, Gustav J., 411, 419
Frutchey, Fred P., 271
Fryer, Douglas, 419, 438, 445
G 


Gage, N. L., 13, 39, 66, 182, 446, 447, 453, 
463 

Galton, Sir Francis, 511 

Gardner, Iva Cox, 461, 464 

Garretson, O. K., 441 

Garrett, Henry E., 28, 29, 39, 369, 515, 521 

Gates, A. I., 125, 141, 142, 344 




Gates Tests of Reading, 104-106, 00-10 
Gates-Strang Health Knowledge Test, 
345 
General science, tests of, 252-255 
Geography tests, 189-191 
Geometry tests, 237-240 
German tests, 214-217 
list of, 216-217 03, 
Gerberich, J. Raymond, 93, 141, 181, 2 
223, 246, 271, 287 
Gist, A. S., 95 í 
Glenn-Greenberg ^ ÉD 
General Science, 2 s: 5- 
Glenn-Obourn Instructional Tests in Phy 
ics, 262 Е m- 
Glenn-Welton Instructional Tests 11 giie 
istry, 262 
Goddard, Eunice R., 224 
Goddard, Henry H., 10, 354 
Goodenough, Florence L., 13, 376 » Scale, 
Goodenough “Drawing a Man 
371-372 
Goodman, Charles H., 376 
Grade equivalent, 81-82 
Gray, C. T., 137, 143 
Standard Score Card for 
Handwriting, 137 
Gray, Н. A., 271 
Gray, W. S., 142 
Gray's Oral Reading Test, 106-108 
individual record sheet, 118 1 jevement 
Gray-Votaw-Rogers General Ach 
Tests, 87, 94 
Greene, Edward B., 333, 445, 494 
Greene, Harry A., 93, 141, 181, 
246, 271, 287 ia 
Grice Generalized Attitude Scale, 
Grover, C. C., 246 
Guidance measurement, 9 
Guilford, J. P., 39, 494, 521 
Guilford, R. B., 494 
Guttman, L., 39 


Tests 10 


Measuring 


205, 225 


454 


H ing 


‚г Rat 
Haggerty-Olson-Wickman Веһауш) 
Schedules, 11, 484-485, 488 349 
Hagman, E. Patricia, 335, 340, 
Handwriting, 127-138 j 
aims and objectives in teaching, 
diagnosis of, 134-136 
measurement of, 128-134 
practice exercises, 136-138 
Handwriting Scale (E. L. Tho 
Handwriting score card, 137 
Harrell, Willard, 322 
Harrison, M. Lucile, 141 
Hartley, Eugene L., 456, 457 


1271 


rndike)» 6 




Hartley Picture Attitude Test toward Ne- 


groes, 456-459 
Hartog, Sir Philip, 4, 41 
Harvard Step Test, 337 
Hathaway, S. R., 470, 494 
Hauch, Edward F., 223 
Hawkes, Herbert E., 65, 181, 223, 
Health education, list of tests in, 
Health information tests, 343-345 
Health Inventory for High School Students 

(Neher), 345-347 
Health practices, 345-347 
Henmon, V. A. C., 221, 223 
Herring, John P., 376 
Hesler, Russell J., 287 
Hiett Stenography Test, 277 
Higher mental processes, tests of, 55 
Highsmith, J. A., 333 
Hildreth, Gertrude, 94, 142, 371, 376 
Hillbrand, E. K., 295 
Hillegas Scale for Measurement of Quality 

, mn English Composition, 
Hinckley, E. D., 464 
Hinckley Attitude Scale toward the Negro, 

454 
History tests, 192-194 
Hoff, A. G., 271 
Home economics, 

rating scales an 
tests for high sch 
дешаи tests, 3 
omogeneous grouping 

tests, 409-410 
Horn, Ernest, 17, 39, 120, 142 
Rune E. Porter, 449, 4 
Horning, 5, D, 322, 334 
Horowitz, E. L., 461 
ш Amy R., 350 
H yer, Louis P., 443 

ubbard, В. M., 442, 446 
‘Udelson, Earl, 161 


measurement in, 312-316 


d check lists, 
ool, 313-315 


1 
and intelligence 


Ability 


Hudelson Typical Composition 
ee 10010 

r E М 

„Б.С, 459, 483 на, 459 


ui P 
nter Test of Social Attit 


I 


I i е ә 
“diana Tests of Home Economics e 


Indiy; 
in paua] differences, 353-354 
.Q. (етае in same grado 
aa telligence quotient. , 58 
racteristics of, 3 361 E reading 


C 
lligence tests, and bes? 
405 J 


40. 


а 
Nd definition of feeblemindeds 4 


Intelligence tests, and election of high- 
school subjects, 405-407 
group, 378-419 
development of, 378-381 
types of, 381-390 
for grades 1 through 3, 391-396 
for grades 4 through 8, 396-400 
for high school, 400—403 
for kindergarten and first grade, 390— 
391 
uses of, 403-418 
individual, 353-377 
description of, 355-368 
development of, 353-355 
general nature of, 18
and meaning of intelligence, 372-374
performance tests, 368-372


validity of, 21-24 
Interest and achievement, 441-442 
Interest measurement, 423-446 
Interests, characteristics of, 423-424 
correlation of, with achievement, 439 
mation, 435-439 


through infor 
tests of, validity of, 438-439 


inventories, 26-435 
list of, 436-437 
uses of, 
methods of discovering, 424-426 
its, 441-444 


ther traits, 


d comparability of tests, 


in relation to 0 
Interpretation an 


35-37 
Interpreting and using results of tests, 73-79 
Towa Every-pupil Tests of Basic Skills, 84- 
85, 97, 148, 190-191 
Iowa Language Abilities Test, 149-151 
Iowa Silent Reading Tests, 17, 111-113 
Iowa spelling Scales, 1 
i 217 


J 
rina eee 446, 470, 494 
hn, Lenore, 
Johns, A., 470, 494 
ohnson, B., 333 $ 
nes; ranklin, 120, ^s 
onn, Arthur Moa iy 9. 267, 406-408 
Jordan, “423-424, 434, 44% 478, 486, 510 
Jorgensen ]bert , 141, 181, 205, 223. 
246, 271,287 | М bo 
dgment f experienced observers, 
La on У айу of, aH 
Jurgensen Clifford, „2 
Е 
es, ay, 6 
Карочї , Peter V., 350 
Katz, 5. E. 39 




Kaulfers, Walter l.c" 
er, Grayson N., 
re ке L., 31, 39, 58, 63, 65, 80, 94, 
184, 206 m 
a B. | 
os Test of Concepts in the Social 
Studies, 200 
Keniston, Hayward, 223 
Kent, Grace H., 361, 376 
Kilby, Richard W., 142 
King, W. A., 95 
Kintner, Madaline, 334 
Klugman, Samuel F., 287 
Knauber, Almer Jordan, 306, 334 
Knauber Art Ability Test, 306-307 
Knuth, William E., 333 
Koos, L. V., 128, 419 
Kopel, David, 142 
Kornhouser, A. W., 441, 446 
Krey, A. C., 58, 63, 65, 184, 206 
Krugman, M., 362 
Kuder, G. Frederic, 25, 29, 39, 425, 446 
Kuder Preference Record, 431-433 
Kuhlmann, F., 376 
Kuhlmann-Anderson Intelligence Tests, 384-387
reliability of, 386-387
validity of, 385-386
Kwalwasser Test of Musical Information and Appreciation, 297-298
Kwalwasser-Dykema Music Tests, 292-293
Kwalwasser-Ruch Test of Musical Accomplishment, 296

L 


Landis, Carney, 39, 470, 494 
Language, aims and objectives of teaching, 144-145
and literature, measurement of, 144-182
lists of tests in, elementary schools, 155
secondary schools, 179
tests in, elementary schools, 145-152
secondary schools, 156-172
written, tests in, 145-152
diagnostic, 151-152
separate, 149-151
(See also Literature)
La Porte, William L., 335
Larson, Leonard, 350
Latin tests, 217-220
list of, 220
Leamer, Emery W., 143
Lectures, effect on attitudes, 461
Lee, Doris May, 142, 247
Lee, Edwin A., 433
Lee, J. Murray, 142, 247

Lee-Thorpe Occupational Interest Inven- 
tory, 433-434 

Lenz, Theodore F., 463 

Leonard, Ruth, 322, 334 

Lewerenz, Alfred S., 304, 334, 463 — 

Lewerenz Tests in Fundamental Abilities О 
Visual Arts, 304-306 

Lewis English Composition Scales, 165 

Lide, Edwin S., 238, 246 

Likert, Rensis, 463, 464 

Lind, Christine, 94 

Linden, Arthur V., 463 

Lindgren, Henry C., 434, 446 

Lindquist, E. F., 13, 65, 181, 206, 223, 246, 
271 

Literary acquaintance (secondary school); 
tests of, 178-180 

Literary appreciation, tests of, 172-178 

Literature, and language (scc Language) 

tests of, elementary schools, 152-156 

secondary schools, 172-180 

Logassa, Hannah, 182 

Longstafi, Howard P., 274 

Loutit, C. M., 480 

Loyes, Edmund, 94 


M 


McAdory, Margaret, 301 
McAdory Art Test, 301-304 
MacBroom, Maud, 17, 39 
McCall, William A., 7, 507 
McCall, William C., 446 
McCloy, C. H., 341, 350 
McCoy, Martha J., 182 
McHale, Kathryn, 438, 446
McHale Vocational Interest Test for College Women, 438
Machine calculation, United-NOMA Business Entrance Tests, 284
Масад Donald, 377 
acQuarrie, Т. W., 321 m 
acQuarrie Tests for Mechanical Ability» 
318, 320-323 
Madsen, I. N., 405, 419 
Maller Case Inventory, 477-478 
Construction and validity, 477-478 
types of scores, 477 
Mann, C. R., 65, 181, 223, 271 
Manual arts, 307-310 
objectives in teaching of, 308-309 
tests of, 309-310 
Manuel, H. T., 25
Matching tests, 52-55
Mathematics, measurement of, 225-257
objectives in teaching of, algebra, 233
arithmetic, 225-226 




Mathematics, objectives in teaching of, 
geometry, 238 
tests, in elementary schools, 225-232 
list of, 242-245 
in secondary schools, 232-241 
Maurer, Katharine M., 21 
Mean, arithmetic average, 503-505 
Mean deviation, 507 
Measurement of intelligence (see Intelligence tests)
Measuring of mental traits, difficulties in, 
4-7 
Measuring instruments, administrability, 
34-35 
characteristics of, 14-39 
economy, 37 
interpretation and comparability, 35-37 
reliability, 26-33 
validity, 14-26 
Mechanical-ability tests, assembly and performance, 318-323
information, 317-318 
paper-and-pencil, 323-329 
Mechanical aptitude and ability, testing 
procedures, 316-329 
information about mechanical ability, 
317-318 
mechanical assembly tests, 318-323 
paper-and-pencil tests, 323-329 
processes analyzed into elements, 317 
Mechanical Aptitude Test of United States 
Army, 438 
Mechanical interest test, 438 
Median, 501-502 
Meier, Norman C., 334 
Meier-Seashore Art Judgment Test, 299-301
Mellenbruch, Paul L., 325 
Mellenbruch Mechanical Aptitude Test for 
Men and Women, 324-326 
Mental-age scales, 355-363 
Merrill, Maude A., 39, 357, 359, 363, 377, 384 
Metropolitan Achievement Tests, 82-83, 
91, 94, 98-103, 122, 146-147, 153-154, 
187-189, 226-228, 250-252 
Metropolitan Reading Readiness Test, 98- 
100, 103 
Micheels, W. J., 65 
Michigan Pulse Rate Test for Physical Fitness, 337
Miller, Augustus T., 337, 350
Minard, Ralph D., 459, 460, 464
Minard Test of Racial Attitudes, 459-460
Minnesota Check List for Food Preparation (Brown), 314-315
Minnesota Food Score Cards (Brown), 315-




Minnesota Mechanical Assembly Test, 318-

92 

Minnesota Paper Form Board Test, Revised, 323-324

Minnesota Vocational Test for Clerical 
Workers, 274-275 

Mitchell, Mildred B., 377 

Mode, 505 

Monroe, Marion, 141 

Monroe, Walter S., 59, 65, 419 

Moody, Caesar B., 389-390 

Morehouse, Lawrence E., 337, 350 

Morgan, B. Q., 223 

Morgan, W. J., 334 

Morrison-McCall Spelling Scale, 124-125 

Morrow, Robert S., 287 

Mosher, Raymond M., 295 

Mosher Test of Individual Singing, 295-296 

Motor coordination tests, 341 

Moving pictures, effect on attitudes, 461 

Multiple-choice tests, construction of, 47-49 

Murphy, Gardner, 463 

Mursell, James L., 290, 333 

Music tests, 288-298 

objectives of, 295 
Musical aptitude, measurement of, 288-294 


Musical Aptitude Test (Whistler and Thorpe), 293-294
Musical information, appreciation, and achievement, 294-298
N 


Nash-Van Duzee Industrial Arts Tests, 
309-310 

Neher, Gerwin Charles, 345 

Neilson, N. P., 342, 349 

Netzer, Royal F., 146 

Newcomb, T. M., 463 

Newkirk, Louis V., 311, 334 

Newkirk-Stoddard Home Mechanics Test, 
310-311 

Newman, Horatio H., 360 

Noll, Victor H., 271 

Norms, local, 36 

Noyes, E. S., 62, 66 


O


Objectives in education, 4 
Odell, C. W., 182, 223, 246, 271 
Oral English, 145-146 
Orleans, Jacob S., 66 
Orleans, Joseph B., 247
Orleans Algebra Prognosis Test, 241
Organization and arrangement of tests, 56-57




O'Rourke Mechanical Aptitude Test, 323, 
332 
Otis, Arthur S., 380 


P 


Paterson, Donald G., 287, 319, 320, 334 
Pearson, John M., 246 
Pearson, Karl, 511 
Percentiles, 501-503 
Performance tests of intelligence, 368-372 
Perry, Fay V., 334 
Perry, Winona M., 247 
Personality inventories, measurement, of 
attitudes, 447-464 
of interest, 423-446 
of personality traits, 465-495 
Personality rating scale for preschool children, 491
Personality traits, measurement of, 465-495 
rating scales, 483-491 
self-inventories, difficulties with, 466-468 
types of, 468-482 


validity of personality inventories, 482-483


Peters, Emma, 223 
Peters, F., 464 
Peterson, Joseph, 376 
Peterson, Ruth, 461, 463 
Physical education, achievement tests, 342- 
343 
and health, measurement of, 335-350 
objectives in, 335-336 
rating scales, 348 
tests, of health information, 343-348 
of physical capacities, 336-342 
Physics tests, 260-262 
Pintner, Rudolph, 355, 371, 376, 409, 412, 419
Pintner General Ability Tests, 381-383
Pintner Intelligence Tests, Intermediate, Advanced, 383
Pintner-Cunningham Primary Test, 382
Pintner-Durost Elementary Intelligence Test, 383, 392-393
Pintner-Paterson Scale of Performance Tests, 369-371
Piper, A. H., 247
Plan for testing program, 70-71
Poetry, exercises in judging, 173-174
Point scales of intelligence, 363-368
Pooley, T 182
eus, D ^»
porte" n 405, 419
P C., 143
теѕвеу, ү” 43 "
Pressey Diagnostic Tests in English Composition, 159-160
Pressey Test of Concepts Used in the Social 
Sciences, 200-201 
Price, Roy A., 206 
Primary mental abilities, 387-390 
Probable error, 506
Problems, skills, and procedures of testing 
in social science, 195-199 
Proctor, W. M., 407-408, 419 
Prognostic tests, 219-220
Luria-Orleans Modern Language Prognosis Test, 219-220
Orleans-Solomon Latin Prognostic Test, 219
Symonds Foreign Language Prognosis Test, 219
Psychological and logical analysis, check on validity of, 18-21
Pullias, Earl V., 94 
Pyle, William H., 142 


Q 


Q, semi-interquartile range, 506-507 
R 


Racial attitudes, measurement of, 45643 
Rating scales, 11, 483-491 
list of, 492 
samples of, 488-491 
types of, 484-488 
Read, James Morgan, 206 
Reading, 95-117
objectives in teaching of, 95-96
spelling, and handwriting, measurement of, 95-142
tests of, in achievement batteries, 96-97
diagnostic, 114-119
oral, 117
reading achievement, elementary school, 103-113
high school, 113-114
reading readiness, 97-103
Thorndike-McCall, 7
Reading comprehension (secondary school), 167-170
Reading diagnosis, tests of, 114-119
Reading tests, lists of, reading achievement, 113
reading diagnosis, 119
reading readiness, 101-102 
Ream Social Relation Test, 438 
Reliability, 26-33 
factors affecting, 29-32 
interpretation, 32-33
methods for computing, 27-29
Remmers, H. H., 13, 39, 66, 182, 446, 453, 454, 463, 464




Rhodes, E. C., 4, 41 
Richardson, M. W., 29, 39 
Rigg, Melvin G., 174, 182 
Measuring the Ability to Judge Poetry, 
174-175 
Rinsland, Henry D., 49, 53, 61 
Roberts, Catharine Ellis, 491, 494-495
Roeber, Edward C., 434, 446 
Rogers, Frederick Rand, 339, 350 
Rogers Strength Test, 339 
Rogers Test of Personality Adjustment, 
479-481 
divisions of, 479 
types of tests, 479-480 
usefulness of, 480 
Rosanna, M., 464 
Ross, C. C., 13, 39, 43, 66, 182 
Ruch, G. M., 66, 80, 94, 223 
Ruch-Cossman Biology Test, 257-258 
Ruch-Popenoe General Science Test, 254-255
Russell, D. H., 125, 142 


S


Scates, Douglas E., 39 
Schlink, F. J., 273 
Schneider, E. C., 336, 350 
Schneider Test of Pulse Rate and Blood 
Pressure, 336-337 
Schneidler, Gwendolen G., 287 
Schnell, Leroy N., 237 
Schoen, Max, 333 
Science, measurement of, 248-272 
aims and objectives, 248-249 
attitudes and interests, 266-267 
scientific thinking, 263-266 
tests, in elementary schools, 249-255 
in secondary schools, 255-262
Science tests, instructional, 262 
list of, 268-270 
of understanding, 263-266
Scientific attitudes and interests, 266-267 
Scientific thinking, 263-266 
Score cards in handwriting, 135-137 
Scoring tests, 72-73 
Scott, M. Gladys, 343, 350 
Seagoe, May V., 247 
Sealey, Glenn A., 66
Seashore, Carl Emil, 288, 333
Seashore's Measures of Musical Talent, 289-292
еер, Louise b 24 


ELE SCIT 
үле, disci subjects and intel- 
Self-inventories, 466-483
difficulties in use of, 466-468




Self-inventories, list of, 481 
types of, 468-482 
validity of, 482-483 
Sentence-completion tests, construction of, 
46-47 
Sentence-organization tests, 163-164 
Sentence-structure tests, 162-163 
Shanner, W. M., 25 
Sharpe, S. E., 377 
Short-answer tests, 43-55 
based on, recall, 43-47 
recognition, 47-55 
Shotwell, Anna M., 94, 141, 404 
Siceloff, Margaret McAdory, 301 
Simmons, Ernest P., 161 
Simple recall tests, construction of, 44-46 
Sims, Verner, 61, 66 
Sixteen spelling scales, 161 
Smith, Dora V., 144, 182 
Smith, Eugene R., 13, 18-21, 39, 55, 66, 173, 
182, 206, 263-264, 272, 446, 450, 455, 
463 
Smith, F. T., 463 
Social-science tests, list of, 203-205 
Social sciences, measurement of, 183-206 
objectives in teaching of, 184-186 
tests of social studies, elementary 
schools, 186-191 
secondary schools, 191-202 
Social terms, tests of, 199-201 
Social utility, check on validity of, 18 
Spache, George, 94 
Spanish tests, 211-214 
list of, 213-214 
Spearman, Carl, 368, 512 
Speer, C. S., 470, 495
Spelling, 117-128 
objectives in teaching of, 121 
selection of word lists, 117-121 
tests of, elementary school, 121-125 
list of, 125 
secondary school, 160-162 
uses of, 125-127 
Spencer, Douglas, 425 
Stalnaker, John M., 62, 66 
Standard deviation, 505-506 
uses of, 507-508 
Standard error, of the mean, 518-519 
of measurement, 33 
of the standard deviation, 519 


Standard score, 81, 507 
Stanford Achievement Test, 82-83, 94, 122, 


147-148, [4 " 3, 250-251 


nsbury, Edgar, 33 
Шш b ed 291, 333 


Starch, Daniel, 3, 39, 43 
Starch, David, 120 




Statistical methods, 499-521
wee of data, 500-502 
central tendency, 502-505 
concepts, 499-500 
correlation, 509-515 
dispersion, 505-509 
sampling, 518-519
uses of coefficient of correlation, 515-518 
Steadman, Robert F., 206 
Steinmetz, Harry C., 463 
Stenographic test, United-NOMA Business 
Entrance Tests, 277-278 
Stenographic tests, 275-278 
achievement, 276-278 
aptitude, 275-276 
Stenography and typewriting, list of tests 
in, 279-280 
Stern, Wilhelm, 372 
Stetson, F. L., 161 
Stewart, Naomi, 419 
Stoddard, George D., 223 
Stogdill, Emily, 470, 495 
Stolz, H. R., 338 
Stone, Clarence R., 142 
Stoy, E. G., 322, 334 
Strang, Ruth, 344, 350 
Strength tests, 338-341 
Strong, Edward K., 429, 446, 484 
Strong Vocational Interest Blank, 426-429 
history of, 426 
scoring, 427
types of items, 427 
validation, 428-429 
Stutsman, Rachel, 376 
Super, Donald E., 13, 25, 442, 446, 494, 495 
Symonds, P. M., 182, 224, 246, 247, 441, 494 


T 


T-score, advantages of, 81, 508-509 

Taylor, Katherine Van F., 446 

Teacher-made tests, organization of items, 56-57

Terman, Lewis M., 23, 39, 80, 94, 354, 355, 357, 363, 376, 384

Terman-Merrill Revision, 355-363
evaluation, 361-363
I.Q., 359-361
mental age, 358-359
principles of construction, 356-358

Test of Critical Thinking in the Social Studies, 197-199

Testing program, 67-79
administering and scoring of tests, 72-73
interpretation of results, 73-79
planning, 67-72

Tests, administering of, 72




Tests, administrability of, 34-35
of foreign languages, evaluation of results, 221-222
interpretation of results, 73-79
teacher-made, organization of items, 56-57
(See also specific names and subjects of 
tests) 
Thinking, methods of testing, 19-21 
Thomas, Minnie E., 470, 495
Thorndike, Edward L., 120, 130, 133, 142, 143, 376, 405
Thorndike Scale for Handwriting of Children, 130
Thorpe, Louis P., 433
Thurstone, L. L., 376, 388, 451, 452, 461, 463, 464
Thurstone, Thelma Gwynn, 388
Thurstone Attitude toward Communism 
Scale, 452 
Tidyman, W. F., 142 
Tiegs, Ernest W., 94, 141 
Tiffin, Joseph, 334 
Tonne, Herbert A., 273, 287 
Toops General Interest Test for Girls, 438 
Torgerson, T. L., 247 
Townsend, Agatha, 141, 206 
Trabue, M. R., 16, 173-174
Trabue's Nassau County Scale of English Composition, 165
Travers, Robert M. W., 66
Traxler, Arthur E., 94, 182, 206, 224, 259, 446
Trieb, Martin H., 342, 349 
Triggs, Frances Oralind, 433, 441, 446 
True-or-false tests, 49-52 
Turney, Austin H., 387 
Turse, Paul L., 287 
Turse-Durost Shorthand Achievement Test, 216-217
Tyler, Ralph W., 13, 18-21, 39, 55, 66, 175, 182, 206, 263-264, 272, 446, 450, 455, 463


Typing achievement tests, 278-280 


U 
United-NOMA Business Entrance Tests, 
284-285 
Units of measurement in education, 7-9 


V


Validity of tests, 14-26 
external, 22-24 
internal, 15-22 
recent trends in, 24-25 




Validity of tests, vitiating factors in, 25-26 

Van Alstyne, Dorothy, 486, 495 

Van Alstyne Rating Scales, 486 

Vander Beke, George E., 223

Van Wagenen, M. J., 376 

Van Wagenen English Composition Scales, 
165 

Vocabulary-load-of-interest inventories, 434-435

Vocabulary tests, 170-171 

Vocational guidance and intelligence tests, 
412-417 


W


Walker, Helen M., 521 
Webb, L. W., 94, 141, 404 
Wechsler, David, 363, 364, 366, 376 
Wechsler-Bellevue Intelligence Scale, 363- 
368 
adult intelligence, 364-366 
distinctive features, 367-368 
evaluation of, 366-367 
verbal and performance, 364 
Weidemann, C. C., 59, 66 
Wellman, Beth, 376 
Wells, F. L., 373 
Werner, Oscar H., 406 
Wesley, Edgar Bruce, 206 




Wesley Test, in political terms, 199 
in social terms, 199-200 

West, Paul V., 129, 143 

Whitford, W. G., 298 

Wiedefeld-Walther Geography Test, 189- 
190 

Winnetka Scale for Rating School Behavior, 
490-491 

Wissler, Clark, 377 

Wittenborn, J. R., 446 

Witty, Paul A., 142 

Woodworth, R. S., 373, 423, 448, 465, 
466 

Woodworth Psychoneurotic Inventory, 466 

Woodyard, Ella, 161, 301 

Woolf, Henriette, 94 

Wrightstone, J. Wayne, 142, 197, 464 

Wrightstone Scale of Civic Beliefs, 201-202, 
224 

Wrightstone Test of Critical Thinking, 197- 
199 


Y


Yerkes, Robert M., 376 
Yoakum, Clarence S., 426 


Z 
Zapf, Rosalind M., 272 


